Skip to main content

30 posts tagged with "ghc"

View All Tags

· 2 min read

Biweekly update from the GHC DevX team at IOG.

Previous updates can be found here.

JavaScript backend

  • Sylvain: updated the MR implementing Template Haskell for the JS backend !9779 to start fixing the remaining recompilation avoidance issues.

  • Sylvain: prepared slides for the GHC workshop about the JS backend. See you there next week!

  • Luite: Continued working on support for the process package with the JS backend, moving JavaScript specific functionality into the System.Process.JavaScript module. Most functionality works, but sometimes tests hang, which seems to be caused by nodejs being inconsistent emitting events on child process creation.

  • Josh: merged changes to the JavaScript code rendering. Previously, we used GHC's SDoc system, which included a hack of using the layLeft function to remove indentation from rendered code to improve code size. Now, code is instead rendered on one line using GHC's new HLine system. HLine doesn't have to waste time on indentation logic, and instead simply concatenates very quickly. It also renders directly to a file handle, rather than spending memory on an intermediate ByteString.

    Additionally, the flag -ddisable-js-minifier is added in this MR, which causes JS code to be rendered with full human-readable indentation and whitespace.

    See: GHC!10500

  • Josh: fixed an issue where ghc --supported-extensions was incorrectly listing JavaScriptFFI. This issue had a workaround in Cabal for several years, but the workaround was causing issues for Cabal support in the JavaScript backend. Now, the extension is removed from non-JavaScript targets, allowing the Cabal workaround to be reverted for new versions of GHC.

    See:

Deprecation warnings

  • Bartek: Deprecated exports proposal MR passes CI and has all the issues addressed, waiting for the MR to be accepted See GHC!10283

  • Bartek: Slight rework of parsing and storing the custom warnings so that it's more generalized (for the upcoming Deprecated Instances) See GHC!10531

  • Bartek: Parsing for the Deprecated Instances implemented, still waiting for the proposal to be accepted

Miscelleaneous

· 2 min read

Biweekly update from the GHC DevX team at IOG.

Previous updates can be found here.

JavaScript backend

  • Josh: Merged a port of clock_gettime from GHCJS into the JavaScript backend. This function is required by some Cabal Setup.hs programs, so this brings us closer to full Cabal functionality, as well as Unix parity. See GHC!10396

  • Luite: Ported the patch to support the process package in the GHC JavaScript backend from GHCJS. The old code was never fully tested and only intended to support Cabal, so there is missing functionality and error reporting is lacking. Work is now on the way to bring it up to level where it can be merged in the process package.

  • Luite: Looked into the test failures for the JavaScript specific weak references. It turned out to be caused by messing deadlock detection. Experimented with making the JavaScript storage manager (FinalizationRegistry) work for deadlock detection,add but it requires signifcant changes to the representation of ThreadId# in the Haskell code. For GHC 9.8 we will most likely document that deadlocks are not guaranteed to be detected with the JavaScript backend, and give the user the option to manually run full deadlock detection (heap scan) if desired.

  • Sylvain: fixed support for the ghcjs_HOST_OS CPP conditional. See GHC#22346 and GHC!10430.

  • Sylvain: fixed getpid. See GHC#23399 and GHC!10454.

Deprecation warnings

  • Bartek: changed the way warning reporting from addUsedGRE was done from artificially removing all warning categories from dynflags to an explicit argument See GHC!10414

  • Bartek: almost finished the deprecated exports proposal, a bug left to fix in backpack See GHC!10283

Performance

  • Josh: Merged improvements to the implementation of handle encoders after approval of the CLC. By manually unboxing functions stored in the CodeBuffer record, the allocations caused by allocations are reduced by ~20%, and general GHC usage is also slightly improved. See GHC!9948 and CI Performance Tests

Linker

  • Sylvain: backported fix for GHC#19421 (internal error: m32_allocator_init: Failed to map) to 9.2. See GHC!10453

Misc.

· 2 min read

Biweekly update from the GHC DevX team at IOG.

Previous updates can be found here.

JavaScript backend

  • Luite: has published a blog post about how Stacks work in the JS backend: https://engineering.iog.io/2023-04-21-stacks-in-the-js-backend

  • Jeff: GHC!10260, which adds a small optimizer to the JS backend, has landed in master.

  • Josh & Sylvain: fixed the -fcheck-prim-bound feature for the JS backend. See GHC#23123 and GHC!10234

  • Luite: improved gc pause times for the JS backend for programs that do not use weak references, by omitting heap scans if there are no pending finalizers. See GHC!10379

  • Josh: refactored some code related to saturated/unsaturated ASTs:

    • First, the identsS/identsE/identsV functions were changed to take a Sat.JStat instead of an Unsat.JStat as an argument, since they previously would raise an error when encountering an unsaturated constructor (Sat.JStat and Unsat.JStat are identical other than this constructor) GHC#23304GHC!10348
    • Then, jsSaturate and satJStat were combined into one function. Previously, jsSaturate would instantiate unsaturated constructors in a JStat AST, but retain the Unsat.JStat type, and satJStat would separately traverse the AST to make the simple type conversion. Since jsSaturate was never used without calling satJStat directly after, these can be combined to traverse the AST and instantiate unsaturated constructors in one pass. GHC#23328GHC!10369
  • Josh: Added the type of an STG closure to various STG data types so that it's passed through GHC.CoreToStg.PreStgRhs and GHC.CoreToStg.StgRhsClosure. With this, we no longer have to guess types based on other information available during JavaScript code generation. This fixes some issues that were revealed in the testsuite. GHC!10261

GHC Proposal #0134 (Deprecated Exports)

  • Bartek: found a bug in warnings about duplicate exports when exporting T(..). See GHC#23318

  • Bartek: Finished exporting part of the renamer for the Deprecating Exports proposal. See GHC!10283

Performance

  • Jeff: First case study on performance engineering for the Haskell Optimization Handbook is done and has been published.

  • Josh: Added a GHC testsuite performance test for measuring memory allocations of encoding. GHC!10347

Modularity

Tooling

· 19 min read

Introduction

I recently gave a short presentation on the topic of stacks in the GHC JavaScript backend to the GHC team at IOG. This blog post is a summary of the content.

In the context of a program produced by the GHC JavaScript backend, two different types of stack exist: The JavaScript call stack and Haskell lightweight stacks. In this post we will focus mostly on the lightweight stacks.

First we will see why using only the JavaScript call stack is not suitable for running compiled Haskell code. Then we will introduce the calling convention we use for Haskell and see how the lightweight stacks are used for making calls and passing around data. After this, we will explore in more detail how they are used for exception handling and multithreading.

Lightweight Stacks

GHC's JavaScript backend translates Haskell, via the intermediate language STG, to JavaScript functions and data. When compiled Haskell code needs to run a computation from a different part of the code, it has to call into the corresponding (JavaScript) function. The code relies on tail calls a lot, and for this reason we cannot use a regular JavaScript function call: JavaScript does not support tail-call optimization and we would run out of stack space quickly.

Instead, we use a technique known as "trampolining" to avoid using space on the JavaScript call stack.

Trampolining

When using trampolining to make a tail call, we don't directly make a function call (which would use JavaScript stack space), instead we return the function itself, and let a "scheduler" do the call for us. This is easiest to show with an example:

// example without trampolining
function example1_direct() {
var arg1 = ... // compute arg1

// make a tail call to xyz (using space on the JavaScript call stack)
return xyz(arg1);
}

example1_direct above uses a direct call to xyz. If we change it to use trampolining to call xyz it looks as follows:

// simplified scheduler function
function scheduler(c) {
// scheduler trampolining loop
while(true) {
c = c();
}
}

// example1 is called from the scheduler loop
function example1_trampoline() {
var arg1 = ... // compute arg1

// use global "register" variable h$r1 for the argument
h$r1 = arg1;

// return a reference to the function to make the call
return xyz;
}

example1_trampoline only works if it's called from the scheduler loop. The scheduler loop keeps calling functions (c = c();), so all calls are made directly from the scheduler function, no growing JavaScript stack.

The scheduler loop cannot deal with function arguments, it calls each c without any arguments. There's no way to efficiently return both a function and arguments to scheduler in JavaScript. Allocating an object wrapping the function and arguments for each call would be prohibitively expensive. Therefore we rely on global variables h$r1, h$r2, h$r3, ..., which we refer to as "registers", to pass the arguments to xyz.

Continuations

Other situations are a bit more complicated, for example if our code needs to do something with the result of the call, like pattern matching on a data constructor or primitive value. Let's extend our example a little bit:

function abc_direct(arg1) {
var result = ... // compute result
return result;
}

function example2_direct() {
var arg1 = ... // compute arg1

var r = abc_direct(arg1);
if(r > 0) {
return 1;
} else {
return 0;
}
}

Here the call to abc_direct is not a tail call; example2_direct has to do something with the result of the call. To make a trampolining version out of example2, we need to consider the following:

  • abc has to return a value to the function that called it
  • example2 needs to inspect the value returned by abc.

Returning the result value directly from abc is not possible: It would end up in the c = c(); loop in scheduler. We need to return a value, but we can only return a function to make a tail call. Where do we get this function?

This is where the lightweight Haskell stack comes into play. We use the convention that when we are done computing, we "return" the result by making a tail call to a continuation (function) at the top of the lightweight stack, passing the result as an argument.

The stack is stored in the global variable h$stack, which is a JavaScript array, and we have a global integer variable h$sp which represents the index of the top of the stack. This means that our continuation is found at h$stack[h$sp], the top of the stack. We call it the usual way: Returning the function to the trampoline and using registers (h$r1, h$r2, ...) for the arguments.

Using this convention, example2 becomes:

function abc_trampoline() {
var arg1 = h$r1;
var result = ... // compute result

// call the continuation on the stack with the result
h$r1 = result;
return h$stack[h$sp];
}

function example2_trampoline() {
var arg1 = ... // compute arg1

// push our continuation
h$sp++;
h$stack[h$sp] = example2_cont;

// tail-call abc through trampoline
h$r1 = arg1;
return abc_trampoline;
}

function example2_cont() {
// pop this continuation from the stack
h$sp--;

// the result is the first register argument
var r = h$r1;

if(r > 0) {
h$r1 = 1;
} else {
h$r1 = 0;
}

// make the tail call with the result in h$r1
return h$stack[h$sp];
}

Having a stack of continuations allows calls of arbitrary depth to be composed.

Stack Frames

Sometimes just storing continuations isn't enough. Suppose that we need to use some data in our continuation, for example:

function example3_direct() {
var arg1 = ... // compute arg1
var a = ... // compute a
var b = ... // compute b

var r = abc_direct(arg1);

if(r > 0) {
return a;
} else {
return b;
}
}

Here we need to make sure that a and b are available in the continuation. To do so, we save them on the stack as follows:

function example3_trampoline() {
var arg1 = ... // compute arg1
var a = ... // compute a
var b = ... // compute b

// push our continuation, saving a and b
h$sp += 3;
h$stack[h$sp-2] = a;
h$stack[h$sp-1] = b;
h$stack[h$sp] = example3_cont;

// tail-call abc through trampoline
h$r1 = arg1;
return abc_trampoline;
}

function example3_cont {
var r = h$r1;

// restore a and b from stack
var a = h$stack[h$sp-2];
var b = h$stack[h$sp-1];

// pop the stack frame (3 slots: a, b, example3_cont)
h$sp -= 3;

// return our result through the trampoline
if(r > 0) {
h$r1 = a;
} else {
h$r1 = b;
}
return h$stack[h$sp];
}

Now the stack consists of JavaScript functions (continuations), each followed by zero or more slots of data. We refer to the JavaScript function as the header and the data as the payload. The header and payload combined are called a stack frame.

Here is an example of what h$stack might look like during the execution of example3_trampoline:

SlotValueFrameType / Description
0h$done1function, "finished" frame
102number, flags
2h$...zireportError2object, exception handler
3h$catch_e2function, exception catch frame
443value of a (in example3 above)
553value of b (in example3 above)
6example3_cont3function, example3_cont frame

Frame 3 is the example3_cont frame with two slots of payload, pushed by our example. Frames 2 and 1 are added by the runtime system. We will discuss them in more detail later.

Stack Frame Metadata

During normal execution of a program, the code that manipulates the stack has knowledge of the specific stack frame it's working with: It knows the size of the payload and which values are stored in each stack slot. However, there are also operations that require dealing with all kinds of unknown stack frames. Exceptions and software transactional memory are the most important ones.

These operations deal with unknown stack frames and sometimes need information about the frame, for example the frame size. Where do we store it?

Every JavaScript function is also an object. This means that we can store arbitrary data in the function's properties. For example for any function f we can do f.x = 5;. This is what we use to store the metadata for stack frames.

Stack frames contain at least the following metadata in the header:

PropertyDescription
f.sizeinteger, number of payload slots in stack frame
f.ainteger, number of register variables used for arguments

example3_cont.size would be 2, since it has two slots of payload. The size of the full example3_cont frame is three slots.

There is also a small number of frames (defined by the runtime system) with the value f.size < 0. These have the stack frame size stored inside the payload. h$ap_gen is another special case, where the stack frame payload encodes both frame size and register information. The runtime system function h$stackFrameSize computes the size of a stack frame:

// compute the size of stack frame h$stack[h$sp] with header f
function h$stackFrameSize(f) {
if(f === h$ap_gen) {
return (h$stack[h$sp - 1] >> 8) + 2;
} else {
var tag = f.size;
if(tag < 0) {
return h$stack[h$sp-1];
} else {
return (tag & 0xff) + 1;
}
}

Here is an example of a stack with an h$ap_gen stack frame, the payload is stored in slots 4 and 5, while slot 6 contains the metadata:

SlotValueFrameType / Description
0h$done1function, "finished" frame
102number, flags
2h$...zireportError2object, exception handler
3h$catch_e2function, exception catch frame
443payload
533payload
65133number, stack frame metadata (payload size and registers)
7h$ap_gen3function, h$ap_gen frame

Exception Handling

Haskell allows exceptions to be thrown within threads and between threads as an alternate way of returning a value. The throw operation transfers control to exception handler in the next catch frame on the stack.

The catch frame has two words of payload:

SlotValueFrameType / Description
001number, flags
1h$...zireportError1object, exception handler
2h$catch_e1function, exception catch frame

The code for h$catch_e is straightforward. It just pops the stack frame and returns to the next frame. This is what happens if no exception has occurred; the program just skips past the exception handler:

function h$catch_e() {
h$sp -= 3;
return h$stack[h$sp];
}

An exception is thrown by the h$throw function, which unwinds the stack. Its implementation in simplified form looks like this:

// throw exception e
function h$throw(e) {
...
while(h$sp > 0) {
f = h$stack[h$sp];
...
// check for stack frames that need to be handled
if(f === h$catch_e) break;
if(f === h$atomically_e) { ... }
if(f === h$catchStm_e) break;
if(f === h$upd_frame) { /* handle black hole */ }
h$sp -= h$stackFrameSize(f);
}
if(h$sp > 0) {
var maskStatus = h$stack[h$p - 2];
var handler = h$stack[h$sp - 1];
// jump to handler
...
} else {
// no exception handler found, report error
...
}
}

h$throw keeps removing stack frames from the stack until some frame of interest is found, using h$stackFrameSize to determine the size of each frame. Eventually, it transfers control to an exception handler or it reports an error if no exception handling frame could be found.

Threads

Concurrent Haskell supports multiple Haskell threads. These Haskell threads are run by one or more system threads. The JavaScript "system" is single-threaded (ignoring Web Workers, which have limited memory sharing), so we have to use a single system thread to run everything. It turns out to be quite straightforward to support multithreading if we use the trampolining calling convention with lightweight stacks. Each Haskell thread gets its own stack and stack pointer.

Each Haskell thread has a thread state object, t of type h$Thread. This object contains the stack (t.stack, Array) and stack pointer (t.sp, number) for the thread, and also keeps track of the thread status, for example whether the thread is finished or masking asynchronous exceptions.

When a thread is created, the h$Thread object is initialized as follows:

/** @constructor */
function h$Thread() {
// get a unique thread id
this.tid = ++h$threadIdN;
this.status = THREAD_RUNNING;

// some initial error handling frame on the stack
this.stack = [h$done
, 0
, h$baseZCGHCziConcziSynczireportError
, h$catch_e
];
this.sp = 3;

this.excep = []; // waiting async exceptions
this.mask = 0; // async exceptions masked
// (0 unmasked, 1: uninterruptible, 2: interruptible)
this.interruptible = false; // currently in an interruptible operation

...
}

The initial stack contains two stack frames. The top three slots contain a catch frame with the h$catch_e header, the h$baseZCGHCziConcziSynczireportError exception handler and 0, for the mask state. The last slot of the stack is for h$done frame, which only has a header and no payload.

Scheduling and Suspending a Thread

Now we have potentially multiple h$Thread objects, each with their own stack. But how do we run the threads?

The trampolining calling convention introduced earlier expects the stack to be stored in h$stack, with h$sp the index of the topmost used slot. This means that if we want to run a thread t, we need to set h$stack and h$sp to the values from the h$Thread object t:

// scheduling a thread t
function __scheduleThread(t) {
h$currentThread = t;
h$stack = t.stack;
h$sp = t.sp;

...
}

We also store t itself in the global h$currentThread variable to keep track of the currently running thread.

While the thread is running, h$stack and h$sp are constantly updated by the compiled Haskell code, so they go out of sync with values saved in the h$Thread object. This means that when we suspend a thread, these need to be copied back into the h$Thread as follows:

// suspending the current thread
function __suspendCurrentThread() {
// save the global h$stack and h$sp back into the thread state object
h$currentThread.stack = h$stack;
h$currentThread.sp = h$sp;

// set h$currentThread to indicate that no thread is running.
h$currentThread = null;

...
}

Having multiple threads, the trampoline loop looks a bit different. A thread returns a special continuation h$reschedule to indicate that another thread may now be scheduled. In simplified form, the scheduler looks as follows:

function scheduler() {
while(true) {
// fetch the next thread to run and schedule it
var t = getNextThread();

// if there are no more threads then we're done
if(t === null) return;

__scheduleThread(t);

// start by executing the continuation at the top of the stack
var c = h$stack[h$sp];

// run until the thread indicates that another may now be scheduled
while(c !== h$reschedule) {
c = c();
}
// suspend the thread again
__suspendCurrentThread();
}
}

In practice, the scheduler is quite a bit more complicated. For example it also uses time-based switching, changing to different thread even when h$reschedule is not returned by the current thread. In that case, the scheduler takes care of saving the thread state onto the stack, using metadata from the continuation.

A discussion of all the ins and outs of the scheduler is beyond the scope of this blog post. But it could be the topic of a follow-up post.

Threads and Asynchronous Exceptions

Haskell exceptions come in two flavours: synchronous exceptions and asynchronous exceptions. Synchronous exceptions always come from the code itself, for example a pattern match failure, a call to error or undefined. Running the code again with the same input always produces the same result.

Asynchronous exceptions come from the outside. They can come from other threads (but don't need to) or from the runtime system. Typical reasons for asynchronous exceptions are timeouts and resource exhaustion. It's perfectly possible for the same function with the same input to be aborted with an asynchronous exception the first run, while running to completion the second time. This means that we must be careful preserving any partially completed computation.

Asynchronous exceptions can happen at any time, which can make them quite tricky to deal with. They could leave the program in an inconsistent state if they occur at the wrong time. Therefore, threads can temporarily block asynchronous exceptions, a process called masking. Different masking states are used for indicating whether exceptions are still masked when the thread is performing an interruptible operation.

Here is a comparison of the main differences between synchronous and asynchronous exceptions:

SynchronousAsynchronous
Thrown using primopraise#, raiseIO#killThread#
Computation on the stackthunks updated to immediately raise the same exception againstack captured in heap objects so the computation can be resumed
Throw to other threadimpossiblepush h$raiseAsync_frame frame onto receiving thread's stack (unless masked)
Throw to current threaduse h$throw immediatelyuse h$throw immediately (no masking)
Maskingno masking, thrown immediatelyexception saved in h$Thread.excep of receiving thread if masked, the sending thread is blocked until the exception is delivered

We can see that while on the surface, synchronous and asynchronous exceptions look similar, there are many differences under the hood. Masking requires some additional machinery for thread synchronization and storing exceptions that cannot be delivered yet.

PropertyDescription
h$Thread.masknumber, mask status indicating unmasked / masked uninterruptible / masked interruptible
h$Thread.excepArray, list of unposted asynchronous exceptions and their posting h$Thread objects. When the receiving thread unmasks, the scheduler posts the exceptions to its h$Thread.stack with h$raiseAsync_frame and unblocks the sending threads
h$Thread.interruptibleboolean, interruptible state of the thread

Important Functions and Values

We have seen various global names and code examples so far, but some example code was simplified a little for clarity. This section offers some pointers to get started with exploring the actual code.

Global Variables

NameDescription
h$stackArray, the stack of the currently running thread
h$spinteger, index of the top slot of the stack
h$r1, h$r2, ..., h$r32Any, "registers", function arguments (first 32)
h$regsArray, function arguments when more than 32 are needed
h$currentThreadh$Thread, object of currently running Haskell thread. null if no Haskell thread is running

Running Threads

These functions start a new thread given an IO action. They also start the main loop of the scheduler.

NameDescription
h$run(c)Run IO action c in a new lightweight thread
h$main(c)Run IO action c in a new lightweight thread, flush buffers and exit the program when finished

Some Useful Internal Functions

NameDescription
h$stackFrameSize(f)Compute the size of the stack frame f
h$throw(e, async)Throw exception e (object), async (boolean) determines whether the exception is asynchronous.
h$mainloopThe entry point for running Haskell code
h$runThreadSliceThe actual trampolining loop (for asynchronous threads)
h$schedulerPicks the next thread to schedule, schedules it and returns the continuation to call
h$suspendCurrentThreadSaves the thread state registers onto the stack to allow another thread to be run, used in conjunction with h$restoreThread
h$rescheduleTells the scheduler to schedule another thread

Some Common Stack Frames

NameDescription
h$doneFinish the current thread
h$returnLoad a value from the stack into h$r1
h$restoreThreadLoad values from stack frame into registers, used to restore the thread state
h$catch_eCatch an exception
h$ap_n_m_eApply a function to m arguments taking n slots on the stack
h$raiseAsync_frameRaise an asynchronous exception received by this thread
h$ap_genGeneric apply a function, the number of arguments is in the stack frame

Conclusion

We have introduced the trampolining calling convention used by the JavaScript backend for GHC and the structure of the stacks used by it.

We have seen that stacks of Haskell lightweight threads are represented by JavaScript arrays with the JavaScript backend. The contents on the stack consist of stack frames with a header and a payload. The header of each stack frame contains some metadata so that code for exception handling can traverse the stack and transfer control to an exception handler.

· 3 min read

Biweekly update from the GHC DevX team at IOG.

Previous updates can be found here.

JavaScript backend

  • Sylvain: fixed the implementation of some thread-related primops (listThreads, getThreadLabel...). See GHC!10303

  • Sylvain: regenerated Cabal lexer with a newer Alex containing his bug fix for the JS backend (cf previous update). Some Cabal tests had to be updated in the process because they were relying on an Alex bug which has been fixed. This raised the question of bug fixes vs the PVP. See Cabal#8896, Alex#227, and PVP#49.

  • Josh: added an implementation for mkdir for node targets. Primarily this is expected to be used by Cabal setup scripts, but any JS backend programs running on node will be able to use it. See GHC!10279.

  • Josh: fixed the -fcheck-prim-bounds flag for the JS backend. The old implementation mostly only caused false negatives in the bounds that were rejected (i.e. it would miss some invalid bounds), but it some cases there were also false positives, caused by casted byte array indexing operations being indexed on the casted type's size. See GHC!10234.

  • Josh: fixed the JS implementation for the access function on files. This function comes from C, but node also provides a direct equivalent - which the new implementation uses. See GHC!10301.

  • Luite: Worked on optimizing finalizers by disabling heap scanning and found an issue affecting debugIO in the base library. This is caused by the code generated for h$appendToHsStringA. A fix is now ready, see GHC!10312

Pipeline refactoring and new IR

  • Jeff: MR!10260 ready to land. This MR paves the way for the new IR, changes the small optimization pass the JavaScript backend had to target the current IR instead of directly printing optimized forms, and adds an IR to IR optimizer. CI shows that the compile time allocations for the JavaScript backend is reduced an average of 3.3% with a maximum reduction 13% with this MR. Note that these reductions come from cleaner code generation, not from the optimizer.

Testsuite and cleanup

  • Sylvain: replaced uses of obsolescent egrep with grep -E. Otherwise recent egrep programs print a warning that makes some GHC golden tests fail. See GHC#22351 and GHC!10308.

Documentation

  • Luite: Finished the blog post on the lightweight stacks and calling convention the JavaScript backend uses. This will be published on the IOG engineering blog shortly.

GHC Proposal #0134 (Deprecated Exports)

  • Bartek: Implemented the parsing and pretty printing of warnings of deprecated exports. See GHC!10283 for the partial implementation of the proposal so far.

Performance

  • Jeff : First case study on performance engineering for the Haskell Optimization Handbook is almost done. The case study demonstrates a first pass of performance engineering on the klister programming language's interpreter. The chapter demonstrates the use of ticky ticky, info-table, biography and retainer analysis profiling to gain a 6-fold improvement in the interpreter.

  • Josh: Updated the unboxed codebuffers implementation to include pattern synonyms for backwards-compatability. Now, the implementation of handle encoding in base can be updated to be more performant without requiring external changes. See GHC!9948.

· 3 min read

Biweekly update from the GHC DevX team at IOG.

Previous updates can be found here.

JavaScript backend

Sylvain: fixed build of GHC with the quickest Hadrian flavour. It was failing due to a incorrect coercion in ghc-heap. See #23181 and !10192

Jeff: Began to implement the plan to add GHCJS's optimizer to the JavaScript backend. This item is slated for GHC 9.8 and is on our roadmap. The first step, which is now finished, is to split the intermediate representation the backend uses to isolate all code generation. This split then allows us to change the existing intermediate representation for the RTS and GC, remove some cruft that was adopted from GHCJS, and add optimization passes.

Jeff: Tested generating let instead of var to take advantage of more recent JavaScript standards than GHCJS comported to. However, this change produced such significant runtime regressions in the generated JavaScript that we abandoned it.

Sylvain: fixed an out-of-bound array access in code generated by Alex, the lexer generator, see Alex#223. The out-of-bound access only triggers an exception with the JS backend; with native backends it only causes a benign data corruption. This was found in Cabal-syntax's lexer, which still needs to be regenerated, see Cabal#8892.

Josh: brought the callbacks CLC ticket to vote. Now it has enough votes to be passed, so we're just awaiting a formal conclusion. Following this, the JavaScript backend will be able to pass Haskell functions to foreign imports as JavaScript callbacks, which in turn enables JavaScript to call into Haskell code.

Merge request: GHC MR 10128

Josh: debugged an issue with the -fcheck-prim-bounds flag. The errors were mostly false negatives in the indicies that were rejected, but there were also a false positive due to the existing code not accounting for zero-size ranges being allowed in range-based operations (even at indicies that don't exist, such as negative).

Merge request: GHC MR 10234

Luite: Worked on using the JavaScript finalization functionality for Haskell weak references. It looks like it's not possible to keep the existing "GHC style" reachability semantics from System.Mem.Weak. Our new approach is to create a "JavaScript style" variant in System.Mem.Weak.JS that avoids the need for expensive heap scanning. Both variants of weak references can be supported by the JavaScript backend, and heap scanning can be avoided if there are no pending finalizers for "GHC style" weak references (in progress).

Compiler Performance

Jeff: Finally landed MR!9702 which refactored GHC's driver to use more eficient data structures. This refactor reduced allocations for every test in GHC's testsuite, with an geometric mean of -1.6%.

Team

The DevX team at IOG welcomes our summer intern Bartek! Bartek will be working on GHC proposal 134; deprecating exports, 516; adding warnings for incomplete record selectors, and deprecating instances. We are glad to be working with him and happy to have him on the team.

GHC migration and stability

Bartek: Started this sprint and has begun to read through the front end code of GHC to begin working on one of the ghc proposal tickets next week.

Miscellaneous

Sylvain: Fixed linker warning issue with -fproc-alignment=64. See #20758 and !10200.

· 3 min read

Biweekly update from the GHC DevX team at IOG.

Previous updates can be found here.

JavaScript backend

Port of GHCJS's Callback feature

Josh: opened an MR for porting GHCJS's GHCJS.Foreign.Callback module into the JavaScript backend. The Callback type allows for passing Haskell functions into the FFI using standard JavaScript-styled function arguments - where compiled functions usually use global variables as registers to pass arguments. By passing functions into FFI imports and storing references to the functions, we can enable a form of FFI "exports" - since the passed functions allow JavaScript code to call back into Haskell code.

This MR additionally adds user guide documentation for both callbacks, and general JavaScript FFI usage.

We're currently awaiting the results of a CLC proposal before the merge can be completed.

See the following links for more information:

Pipeline refactoring and new IR

Jeff: Began to lay the foundation for splitting the DSL in the JavaScript backend by segregating code generation to a new, basically identical DSL. The motivation is that by splitting the existing DSL, we can now replace the old DSL while still providing working builds. This is the first step in a multistep plan that eventually ends with GHCJS's optimizer and a typed sunroof based DSL. MR is up

Faster weak references implementation?

Luite: investigated replacing heap traversal with newer JS feature.

ES2021 has new functionality for weak references and finalization. These do not directly map to Haskell weak references and finalizers, but it's probably possible to use them to avoid a lot of the expensive heap scanning that we currently do.

It's not yet clear whether we can completely remove the heap scanning while exactly preserving the current semantics for weak references.

Testsuite and cleanup

Sylvain: removed dead code in the JS RTS !10102

Sylvain: did some triage of tests in GHC's testsuite failing with the JS backend. See tickets #22370, #22374, #22576, and merge requests !10148, !10150.

Code generator and performance

Jeff: tested changing the code generator to generate let instead of var. This led to a large performance regression (~20%), which was not isolated to any single function (via a ticky profile in node). The working hypothesis is that let requires more work when allocating closures because the JavaScript engine needs to ensure all variables are lexically scoped. We have not confirmed that this is the cause yet, but we did find that we generate closures that are allocated at runtime in the code generated for base. So after review we decided to leave the vars and not generate lets. Unless we begin to observe issues around scoping, using the safer construct seems to be too much of a performance hit.

Miscellaneous

Jeff: Revived Data structure work in GHC.Unit.State resulting in -1.7% reduction in allocations (by geometric average) and -9% in some cases. MR is here and ready to land, just needs to upstream a single-line patch to haddock.

· 2 min read

Biweekly update from the GHC DevX team at IOG.

Previous updates can be found here.

JavaScript backend

Luite: !10059 Implemented computing a valid LambdaFormInfo for the JavaScript backend so that interface files can be written without warnings. Fixes #23053.

Template Haskell

Luite: !10008 implemented keeping track of subdirectories in GHC's temporary file manager. It has been updated after reviews and is now ready to be merged.

JavaScript EDSL

Jeff: MR is open. In response to feedback from the team Jeff has added (and removed) several features that distinguish the eDSL from sunroof. These include:

  • removing threading and continuations
  • adding named unique variables and a proper switch statement
  • removing a dependency on the operational package
  • removing a dependency on the data-reify package
  • added a compilation function that compiles the eDSL to the IR the JavaScript backend uses.

These changes allow more of the RTS to be replaced in the eDSL and allow the RTS to be typed using Haskell's type system. For example, now the STG registers track the type of the values they hold. The RTS migration is now underway.

copyMutableArray primops

Sylvain: fixed JS implementations for copyMutableByteArray# and copySmallMutableArray primops. They were wrong in some cases when the source and target arrays overlap. See #23033 and !10037.

Testsuite

Sylvain: made a minor cleanup in the testsuite to correctly tag tests requiring Cmm support (which the JS backend doesn't have): !10043.

Callbacks

Josh: ported GHCJS's GHCJS.Foreign.Callback module. Manual testing revealed some minor changes required in the JavaScript backend, including to base.GHC.JS.Prim, to make everything work as expected.

This will enable passing Haskell functions into foreign imports, which in turn enables a form of calling into Haskell from JavaScript code.


Compiler performance

Optimization Handbook

Jeff: Work has begun on the first major case study: a performance regression in the sbv library.


RTS linker

Sylvain: helped fixing a RTS linker bug. The RTS ELF linker didn't properly take into account required section alignments and always used 16-bytes alignment. However AVX instructions generated by the C compiler may expect 32-bytes alignment (even if the code uses unaligned load intrinsics, the C compiler may optimize them into aligned load instructions, as was the case here). The fix was simple, the test case a bit more involved. See #23066 and !10087.

· 3 min read

Biweekly update from the GHC DevX team at IOG.

Previous updates can be found here.

JavaScript backend

Template Haskell

Sylvain: !9779 adding TH support for the JS backend passes CI and is ready for reviews.

Ticket #23013 has been opened to keep track of an issue with the recompilation avoidance mechanism. Fixing the issue seems to require some invasive refactoring that is best left for a future merge request.

Luite: !10008 Implemented keeping track of subdirectories in GHC's temporary file manager. Ready for review. Fixes #22952.

Temporary subdirectories are used when linking Template Haskell with the JavaScript backend and also in some situations when linking with other backends. This would always result in files being left behind in GHC's temporary directory (and a warning at high enough verbosity settings) since these subdirectories were never removed. With this patch, GHC keeps track of all created subdirectories and removes them at the end of the session.

JavaScript RTS refactor

Josh: has merged the refactor of the RTS generation module to reduce redundant code. In this refactor, debug logging was also completed to determine the correct numbers to use as the cache sizes for generated JavaScript names - allowing us to vastly reduce the size of these arrays for efficiency, and to remove panic cases in favour of generating higher numbered names without caching.

!9794

Integer performance

Sylvain: !9825 has been updated after an helpful review by Matthew Claven.

JavaScript EDSL

Jeff: MR is open. In response to feedback from the team Jeff has added (and removed) several features that distinguish the eDSL from sunroof. These include: removing threading and continuations, adding named unique variables and a proper switch statement. These changes allow more of the RTS to be replaced in the eDSL and allow the RTS to be typed using Haskell's type system. For example, now the STG registers track the type of the values they hold.

CI

Sylvain: fixed CI script test that prevented performance results to be stored for the JS backend (see #22923 and !10026).


Compiler performance

More-strict break/span

Josh: opened a CLC (Core Libraries Committee) proposal to add stricter versions of break and span to Data.List in addition to the existing lazy versions. The proposal considers evidence that these versions are situationally more performant, by comparing allocation statistics, generated STG, and microbenchmarks - as well as making the argument for consistency with existing List functions that also have strict versions.

See also:

Unboxed CodeBuffers

Josh: opened a CLC proposal to modify the implementation of CodeBuffers in base to use unboxed tuples in the return type of encoding functions. This change presents a significant allocation improvement, due to the difficulty GHC has with applying a certain optimisation within data types.

See also:

Optimization Handbook

Jeff: Has been hard at work on the Optimization Handbook. He has finished a chapter on Lambda Lifting, significantly expanded the glossary, and added documentation to the sphinx-exec-directive haskell extension that he finished last month. The optimization handbook is now in review by IOG's IT team to migrate it to the Haskell Foundation website.


GHCi

Luite: Added a testcase and did some cleanups in the types of the C code in MR !9957, and adjusted the bytecode generator to not produce zero offset SLIDE instructions anymore. This is now ready for review.

· 6 min read

Biweekly update from the GHC DevX team at IOG.

Previous updates can be found here.

JavaScript backend

Template Haskell

Luite: fixed the support for one-shot mode (GHC's -c command-line flag) in the TH JS linker.

Luite: Investigated a warning about the temporary directory not being removed after running Template Haskell with the JavaScript backend. It turned out that GHC's GHC.Utils.TmpFs.newTempDir, which is used by the Template Haskell linker, does not allow the newly created directory to be removed (see #22952).

Sylvain: cleaned up the merge request, removing unnecessary changes and adding documentation.

JavaScript backend CI

Jeff: JavaScript backend CI was finally merged! Now that we have CI we are unblocked on several fronts, such as, implementing faster arithmetic and fixing some async exceptions. In general, we can now have confidence that our work is progressing the JavaScript backend to a better state.

Sylvain: fixed a spurious failure on JS CI due to some test passing on fast runners while it was expected to fail (see !9934).

Josh: fixed some inaccurate test predicates !9939

FileStat

Josh: rebased and merged !9755. This patch changes the representation of the JavaScript equivalent of the C struct stat to make its field offsets match the C ones: some Haskell codes directly access fields of this structure using hsc2hs to get the field offsets from the C headers.

This patch also adds fields to the JavaScript file stat that were previously not included, such as modification and access times.

JavaScript RTS refactor

Josh: rebased !9794 which consists in the refactor of the module generating some part of the RTS. The new JS CI job found a bug in the patch that caused ~50 tests to time out, so waiting for CI to be set up before merging this MR was judicious.

This MR also became an opportunity to revisit some arbitrary cache sizes in the RTS code generator. This is still ongoing work.

Warnings

Luite: We accidently removed the check for the "javascript" calling convention on foreign imports, allowing this convention to be wrongly used on native platform. Fixed in !9880.

Fix for asynchronous exceptions

Luite: Fixed an issue in the garbage collector for the JavaScript backend: A thread that posts an asynchronous exception (throwTo) to another thread is temporarily suspended until the exception has been delivered. The garbage collector did not correctly follow the list of threads suspended in this way, potentially considering them unreachable and cleaning up data referenced by them. See #22836 and !9879.

Integer performance

Evaluating the following expression is very slow in general but especially with the JS backend:

1 `shiftL` (1 `shiftL` 20) :: Integer

We've had to mark a test computing this as broken on CI because it triggers a timeout error. Luckily the identification of slow operations is easy with JavaScript profiling tools (see graphs in https://gitlab.haskell.org/ghc/ghc/-/issues/22835) and we know that Word32 primops are the culprit in this case.

Sylvain started replacing the uses of JavaScript's BigInt in the implementation of these primops with usual JavaScript numbers. See !9825.

JavaScript EDSL

Jeff: JavaScript eDSL based on sunroof close to MR, see #22736 for background. Compiler is complete. Major items left are: the interpreter to translate to JStat, filling in documentation, and testing now that CI has been merged.

Documentation

Jeff: Wrote the JavaScript backend release notes. Notes pending approval. We cite the JavaScript backend wiki page in the release notes. So Jeff and Sylvain heavily edited the wiki pages to make them suitable for external customers.

Change from js to javascript architecture

In #22740 it was noticed that hackage-server would prevent the upload of the upcoming base package bundled with GHC 9.6. The reason is that hackage-server relies on the cabal --check feature which filters out perfectly valid packages (it happened before, for example with ghc-api-compat).

In our case, the package was rejected because the js architecture wasn't recognized as a built-in one, but luckily we could fall back to the existing javascript built-in architecture defined for GHCJS (if it wasn't a JS backend, we would have had to fix Cabal, update hackage-server dependencies, and redeploy hackage-server...).

Sylvain completed Ben's MR in !9814.

For early users of the JS backend the change means that from now on:

  • configure command is: emconfigure ./configure --target=javascript-unknown-ghcjs
  • Cabal condition is: arch(javascript)
  • CPP condition is: #if defined(javascript_HOST_ARCH)

Compiler performance

More-strict break

Josh: opened merge request !9868 to add a stricter break' version to base - however, this would require a CLC proposal.

Further analysis was done using ticky profiles on a simple test program that benchmarks GHC's startup code. This has found an example where a more-strict break is an improvement in GHC, which will provide motivation for the CLC proposal.

Unboxed CodeBuffers

Josh: implemented changes to GHC's text encoding buffers to use unboxed tuples on handle encoders/decoders. The buffers pass around and repeatedly pack/unpack a tuple in an IO inner loop, which causes a significant number of unnecessary allocations. By replacing this with an unboxed tuple, and replacing the IO with manually passing around a State# RealWorld in the same tuple, we're able to reduce allocations by nearly 50% in a pathological example (non-allocating loop printing characters). See #22946 and !9948.

Optimization Handbook

Jeff: opened IT ticket to move the Optimization Handbook to Haskell Foundation's repository. IT stated they need to check with legal, of course.

Jeff: almost finished the lambda lifting chapter; major remaining items are adding some glossary terms and describing the interaction between lambda lifting and calling conventions.

Constant folding for division operations

While looking into his old merge requests still opened, Sylvain nerd-snipped himself into fixing constant folding rules for division operations (see #22152).

MR !8956 adds the following rewrite rules:

   case quotRemInt# x y of
(# q, _ #) -> body
====>
case quotInt# x y of
q -> body


case quotRemInt# x y of
(# _, r #) -> body
====>
case remInt# x y of
r -> body


For all primitive numerical types:

(x `quot` l1) `quot` l2
| l1 /= 0
| l2 /= 0
| l1*l2 doesn't overflow/underflow
====>
x `quot` (l1 * l2)

It also makes some division primops (Word64/Int64 Quot/Rem, WordQuotRem2Op) ok-for-speculation when the divisor is known to be non-zero, similarly to other division primops. Otherwise the last rule wasn't firing in the added test because we got the following Core (simplified for the presentation):

case quotWord64# x# 10#Word64 of
ds1 -> case quotWord64# ds1 20#Word64 of
ds2 -> ...

and not:

case quotWord64# (quotWord64# x# 10#Word64) 20#Word64 of
ds2 -> ...

GHCi

Luite: A test run of GHC 9.6 with the -fbyte-code-and-object-code flag on head.hackage revealed issue #22888 with bytecode size limits. Many bytecode instructions have Word16 operands, which is not always enough to run programs generated from optimized core. The solution is to enable large operands for all the bytecode instructions that deal with stack offsets. See #22888 and !9957.

· 7 min read

This is the second biweekly update of the IOG GHC DevX team. You can find the previous one here.

JavaScript backend

Template Haskell

Sylvain continued his work on the implementation of Template Haskell for the JS backend. He factorized the code from iserv and libiserv into the ghci library. This makes it easy for GHC to load and run the external interpreter server (iserv) that ends up compiled into JavaScript in a NodeJS instance. He modified GHC to avoid creating ByteCode objects (which are unsupported by the JS backend) and to instead compile and link JavaScript code.

Template Haskell basically works with the JavaScript backend now, except for a few corner cases (such as one-shot mode), but these should be fixed in the coming days/weeks.

Luite modified Sylvain's JavaScript code to fix support for Darwin and Windows. If you want to test it, a draft merge request has been opened: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/9779

JavaScript backend in the browser tutorial

Josh published a tutorial about using code produced by the JavaScript backend in a web page: https://engineering.iog.io/2023-01-24-javascript-browser-tutorial

Cabal support for js-sources

Sylvain added tests to his patch that adds cabal support for the js-sources stanza when GHC is used as a compiler (and not only when GHCJS is used as a compiler), allowing the patch to be merged: https://github.com/haskell/cabal/pull/8636

https://github.com/haskell/cabal/issues/8639 is still open though so be careful if you try to use js-sources, they still don't work in some cases.

JavaScript backend CI

The JavaScript backend CI has been an ongoing saga for the last month, and has been a blocking item for JavaScript Backend development. Thankfully it is close to being merged. This week, Jeff rebased the CI to discover that recent changes removed nodejs (the node that is bundled with emscripten) from the CI containers $PATH. So Jeff patched the CI images to add node. Now the CI runs and has discovered two new bugs even before being merged. All that is left is to bump some submodules and the CI will be ready to land in GHC HEAD.

FileStat

Josh opened an MR to match the layout of the JavaScript fileStat with the layout of the equivalent struct defined in Emscripten's stat.h. This is needed to ensure that hsc2hs features work correctly with this data type. Hsc2hs features can peek at memory locations directly without using accessor functions, and the memory locations are taken from the header file, hence the requirement to match these layouts.

This MR only touches JavaScript files, so we're waiting on the approval of the JS CI before continuing. For more information, see https://gitlab.haskell.org/ghc/ghc/-/issues/22573

JavaScript RTS refactor

Josh refactored parts of the GHC.StgtoJS.Rts.Rts module to remove special cases from one of the n-argument JavaScript RTS functions, and combined these cases into a general case. Thus, simplifying the Rts module's code.

Josh also improved the caching in the JavaScript Backend for commonly used names in the generated JavaScript ASTs. Previously, names such as x1 would require allocation for each use: first by allocating a String, which was then converted to a GHC FastString, which was finally wrapped in a JavaScript AST data constructor. Now, these names are captured in a static CAF'd Array and each reference was replaced with a lookup to the corresponding slot in the array. This avoids the extra allocations and ensures these names are shared.

For the full set of refactors, see: https://gitlab.haskell.org/ghc/ghc/-/issues/22822

JavaScript EDSL

Jeff began work on a new eDSL to replace the existing DSL the JavaScript Backend inherited from GHCJS. This solves a design problem. The existing DSL in the JavaScript Backend is used for two things: (1) to write the JavaScript Backend's garbage collector, runtime system and other low level bits; (2) as a target for optimizations; (3) as the source for code generation. This becomes problematic because the existing DSL tries to do so much that it ends up not being particularly good at (1), (2) and (3).

The fix is to separate concerns by writing a new DSL for (1). The DSL is Type Safe and based on the Sunroof compiler (Thanks Andy Gill et al. for your labor!). Then, we'll compile the new DSL to the existing GHCJS DSL. This way we can slowly begin to replace JavaScript Backend code module by module, thus gaining type safety while still continuing other work. The end game of this project is to eventually remove the GHCJS DSL entirely and then compile our new DSL to a better intermediate representation that is explicitly crafted to make optimizations easier.

Blog posts

Luite has been working on new blog posts about internals of the GHC JavaScript backend and a strategy guide for debugging the generated JavaScript code. These will be published in the coming weeks.

JavaScript backend configuration issue in a Docker image

Sylvain debugged a configuration issue of GHC with the JavaScript backend (see #22814). The recommended way to configure is to use the following command line:

emconfigure ./configure --target=js-unknown-ghcjs

where emconfigure is provided by the Emscripten project and sets appropriate environment variables (CC, LD, AR...).

However in some cases it seems like these variables are set as follows:

CC=emcc
LD=emcc
...

in which case GHC's configure script will silently ignores them... and uses the C compiler for the host platform instead (x86-64, aarch64...). As the C compiler is only used for the CPP pass, it results in some inscrutable errors. In #22814 the error is due to CSize being inferred as a 64-bit type while it should be 32-bit for the JavaScript platform, leading to CSize values being passed as 2 arguments in FFI calls while the callee expects 1.

Calling configure with the right environment variables fixes the issue:

./configure CC=$(which emcc) LD=$(which emcc) --target=js-unknown-ghcjs

Discussion about JavaScript backend maturity

Quite some time was spent discussing users' expectations about the JavaScript and WASM backends. We would like to make it very clear that even if GHCJS has been here for a long time, the JavaScript backend doesn't yet have the same level of maturity.

Bugs, missing features, and sub-par performance are to be expected in the 9.6 release. We encourage adventurous users to try out this release and send us feedback, but it's best to exercise caution before relying on it for production.

Compiler performance

More-strict break

Josh did more investigation into the performance difference that introducing some strictness into the break function would make. The STG and microbenchmarks are very promising, but using the "compile cabal" benchmark, there doesn't seem to be a noticable time difference caused by the change. In terms of memory, it seems to reduce GC copying, but slightly increase overall allocations and total memory usage.

There's pathological cases in using a strict break by default - for example in the lines function. Because of this, it's likely that this optimization would have the most benefit if applied in isolated cases in GHC, if any pathological lazy cases are found.

Misc

Cross-compilation from Linux/Darwin to Windows

Ticket #22805 reminded Sylvain that he had made MR !9310 more than two months ago to fix the same issue: cross-compilation from Linux/Darwin to Windows. The MR has now been updated, tested, reviewed, and merged.

Hadrian rules to build the Sphinx-based docs

Sylvain started working on adding a chapter about the JavaScript in GHC's Users Guide. The first step was to fix Hadrian's build rules for the Users Guide (MR !9795)

· 9 min read

In a previous post we introduced GHC's new JavaScript backend, which allows the compilation of Haskell code into JavaScript. This is the first tutorial in a new series about the JavaScript backend. In this post, we'll build GHC as a JavaScript cross-compiler and run a trivial Haskell program in the browser.

We plan to write more of those blog post in the coming weeks and months as we add new features (e.g. support for "foreign exports" that will allow JavaScript code to call into Haskell code, support for Template Haskell, etc.). For now it relies on our "insider" knowledge (e.g. how the FFI works) that isn't well documented elsewhere. We do plan to add a chapter about the JavaScript backend in GHC's user guide, but for now your best chance is to look at GHCJS's documentation or at the source code.

Please note: this is a technology preview of the in-development JavaScript backend for GHC. Not all Haskell features are implemented, and bugs are expected. It is currently rather complicated for JavaScript code to call into Haskell code ("foreign exports" aren't implemented). GHC isn't a multi-target compiler yet, so a GHC executable built for a native platform (Linux/x86-64, Windows/x86-64, Darwin/AArch64...) as currently distributed (via ghcup, Stack, binary distributions, etc.) won't be able to produce JavaScript. Official prebuilt binary distributions are likely to remain unavailable until GHC gains multi-target support - requiring the JavaScript backend to be built from source even after the backend matures. That's why we start this post with the required steps to build yourself a GHC compiler capable of producing JavaScript.

Building GHC as a Cross Compiler to JavaScript

Installing Dependencies

First we need to install all the typical dependencies for GHC plus Emscripten, so our final list is:

  • GHC version 9.2 or later
  • Cabal
  • Alex
  • Happy
  • Emscripten to configure with
  • (Optional) NodeJS to run JavaScript locally

Let's take these in order, a standard GHC distribution with Cabal is needed so we can boot our new compiler. We recommend using GHCUP (https://www.haskell.org/ghcup/install/), or your system's package manager to install the this.

We need Alex and Happy to build GHC, these can be installed through Cabal:

cabal install alex happy -j

We need Emscripten during the configure step of the build. Emscripten should be available in most package managers, but you can also build and install it from source:

git clone https://github.com/emscripten-core/emsdk.git
cd emsdk
./emsdk install latest
./emsdk activate latest
source ./emsdk_env.sh

After installing Emscripten, emconfigure should be available on your system path. Use which emconfigure to check that it is on your $PATH. If you built from source, then the output should point to a location within the emsdk git project like so:

$ which emconfigure
/path/to/emsdk/upstream/emscripten/emconfigure

For more detailed installation instructions, see https://emscripten.org/docs/getting_started/downloads.html.

That's all we need to build GHC as a cross compiler. NodeJS can be installed via your system's package manager if you want to run the JavaScript programs locally. We'll assume it's in your $PATH for the rest of the blog post.

Building GHC

With all the dependencies installed, we can clone GHC HEAD and build the cross compiler:

git clone https://gitlab.haskell.org/ghc/ghc.git --recursive

You should notice quite a few submodules being cloned as well as the main repo; expect this to take a while. Once this has completed, navigate to the ghc directory and run the following configuration commands:

cd ghc
./boot
emconfigure ./configure --target=javascript-unknown-ghcjs

emconfigure ./configure --target=javascript-unknown-ghcjs will finish by outputting a screen that looks like:

----------------------------------------------------------------------
Configure completed successfully.

Building GHC version : 9.5.20221219
Git commit id : 761c1f49f55afc9a9f290fafb48885c2033069ed

Build platform : x86_64-unknown-linux
Host platform : x86_64-unknown-linux
Target platform : javascript-unknown-ghcjs

Bootstrapping using : /home/josh/.ghcup/bin/ghc
which is version : 9.4.2
with threaded RTS? : YES

Using (for bootstrapping) : gcc
Using clang : /home/josh/emsdk/upstream/emscripten/emcc
which is version : 15.0.0
linker options :
Building a cross compiler : YES
Unregisterised : NO
TablesNextToCode : YES
Build GMP in tree : NO
hs-cpp : /home/josh/emsdk/upstream/emscripten/emcc
hs-cpp-flags : -E -undef -traditional -Wno-invalid-pp-token -Wno-unicode -Wno-trigraphs
ar : /home/josh/emsdk/upstream/emscripten/emar
ld : /home/josh/emsdk/upstream/emscripten/emcc
nm : /home/josh/emsdk/upstream/bin/llvm-nm
objdump : /usr/bin/objdump
ranlib : /home/josh/emsdk/upstream/emscripten/emranlib
otool : otool
install_name_tool : install_name_tool
windres :
dllwrap :
genlib :
Happy : /home/josh/.cabal/bin/happy (1.20.0)
Alex : /home/josh/.cabal/bin/alex (3.2.7.1)
sphinx-build :
xelatex :
makeinfo :
git : /usr/bin/git
cabal-install : /home/josh/.cabal/bin/cabal

Using LLVM tools
clang : clang
llc : llc-14
opt : opt-14

HsColour was not found; documentation will not contain source links

Tools to build Sphinx HTML documentation available: NO
Tools to build Sphinx PDF documentation available: NO
Tools to build Sphinx INFO documentation available: NO
----------------------------------------------------------------------

If everything is correct, you'll see that the Target platform is set to javascript-unknown-ghcjs, and the build tools will be set to their Emscripten counterparts: ar becomes emar, nm becomes llvm-nm, etc.

Finally, to build GHC:

./hadrian/build --bignum=native -j --docs=none

Expect this to take around a half hour or longer. If all goes well you should see:

/--------------------------------------------------------\
| Successfully built library 'ghc' (Stage1, way p). |
| Library: _build/stage1/compiler/build/libHSghc-9.5_p.a |
| Library synopsis: The GHC API. |
\--------------------------------------------------------/
| Copy package 'ghc'
# cabal-copy (for _build/stage1/lib/package.conf.d/ghc-9.5.conf)
| Run GhcPkg Recache Stage1: none => none
| Copy file: _build/stage0/bin/javascript-unknown-ghcjs-ghc => _build/stage1/bin/javascript-unknown-ghcjs-ghc
Build completed in 1h00m

Take note of _build/stage1/bin/javascript-unknown-ghcjs-ghc path. This is the GHC executable that we'll use to compile to JavaScript. To make life easier on ourselves we can alias it:

alias ghc-js=`pwd`/_build/stage1/bin/javascript-unknown-ghcjs-ghc

First Haskell to JavaScript Program

Now that we have a version of GHC that can output JavaScript, let's compile a Haskell program and run it with NodeJS. Make a file named "HelloJS.hs", with the following contents:

-- HelloJS.hs
module Main where

main :: IO ()
main = putStrLn "Hello, JavaScript!"

Now we can compile it using the alias we defined earlier:

ghc-js HelloJS.hs

You should see the following output, and a HelloJS executable.

[1 of 2] Compiling Main             ( HelloJS.hs, HelloJS.o )
[2 of 2] Linking HelloJS.jsexe

If you have NodeJS is on your Path, then this executable can be run just like any other command line program:

./HelloJS
Hello, JavaScript!

Notice that a directory called HelloJS.jsexe was created. This directory contains all the final JavaScript code, including a file named all.js, and a minimal index.html HTML file that wraps all.js. For now, we'll only care about all.js and return to index.html later. all.js is the payload of our HelloJS exectuable. The executable is simply a copy of all.js, with a call to node added to the top. We could have equivalently run our program with:

node HelloJS.jsexe/all.js

Haskell in the Browser

We saw in the previous example that GHC's JavaScript backend allows us to write Haskell and run the output JavaScript with NodeJS. This produces a portable executable, but otherwise doesn't enable anything we couldn't do before; GHC can already compile Haskell to run on most platforms! So let's do something novel, and run Haskell in the browser.

In this example, we'll use Haskell to draw a simple SVG circle to our browser window. Put the following code in a file named HelloBrowser.hs:

-- HelloBrowser.hs
module Main where

import Foreign.C.String

foreign import javascript "((arr,offset) => document.body.innerHTML = h$decodeUtf8z(arr,offset))"
setInnerHtml :: CString -> IO ()

circle :: String
circle = "<svg width=300 height=300><circle cx=50% cy=50% r=50%></circle></svg>"

main :: IO ()
main = withCString circle setInnerHtml

Notice that we've encountered a Haskell feature that's only available in the JavaScript backend: JavaScript foreign imports. This feature allows our Haskell program to call JavaScript functions. In our example we use this feature to call a JavaScript arrow function that updates the body of the page with our HTML snippet containing a drawing of a circle. Alternatively, we could have set the foreign import to a function symbol like so:

foreign import javascript "setInnerHTML"
setInnerHtml :: CString -> IO ()

where setInnerHTML is defined in a .js file that is then loaded by passing the JavaScript file to GHC along with the Haskell sources.

Next, we can compile our program to JavaScript, again with our built GHC:

ghc-js HelloBrowser.hs

Or ghc-js HelloBrowser.hs foo.js if setInnerHTML is defined in foo.js.

Recall the index.html file inside the HelloBrowser.jsexe directory. This HTML file has our compiled JavaScript already included, so if you open it in your browser, you'll find it loads our SVG circle in the top-left of the page!

Example webpage screenshot

index.html contains the minimal HTML code required to load the generated JavaScript code. It simply loads the all.js file mentioned above with the following script tag that you can reuse in your own HTML files:

<script language="javascript" src="all.js" defer></script>

As the JS backend still lacks support for some FFI features (foreign exports, foreign "wrapper" imports...), JavaScript codes can't easily interact with Haskell codes. It reduces the amount of advanced/interesting examples we can present for now. We'll publish new blog posts illustrating these features when they will be implemented.

Conclusion

In this post, we've seen how to build a first Haskell program to run in the browser using a preview of GHC's in-development JavaScript backend. This program used "foreign imports" to make a JavaScript function available within the Haskell code, which allows a limited interaction between Haskell and the browser. We also saw the structure of the outputs of the JavaScript backend, in the .jsexe directory, and how this allows our Haskell program to be invoked by a custom HTML wrapper. This was all enabled by building a version of GHC from source, with the build process having been configured with Emscripten to produce a GHC exectuable that targets JavaScript.

· 3 min read

Starting in 2023 we–the IOG GHC DevX team–are going to provide biweekly updates about our work. This is the first edition.

JS backend

JS backend in the browser tutorial

We are working on a draft of a JS backend tutorial about using it to build a simple web application: https://github.com/input-output-hk/engineering/pull/24

Publication is expected next week.

Cabal support for js-sources

Sylvain made a patch to add cabal support for the js-sources stanza when GHC is used as a compiler (and not only when GHCJS is used as a compiler): https://github.com/haskell/cabal/pull/8636

It’s missing tests and then it should be ready to be merged.

JS backend CI

Jeff worked on adding a proper CI job that runs the full testsuite with the JS backend. Cf ticket #22128 and merge request !9552.

He had to fix some unexpected test passes (!) with the JS backend due to an imprecise req_smp predicate used by the testsuite. More details on #22630 and !9568.

JS backend: Template Haskell

Luite and Sylvain started implementing support for Template Haskell (TH) with the JS backend.

Sylvain reimplemented support for running an adapted version of the THRunner.js script from GHCJS. He also refactored the JS linker and implemented incremental linking.

The next step is to link and to run an instance of the external interpreter code that implements the Template Haskell protocol (execution of the Q monad) adapted to run in JavaScript. GHCJS used to have its own duplicated code for this but for maintenance concerns it’s much better to reuse the external interpreter code.

GHCi

GHCi: sized primitive types (Word8#, etc.)

Luite implemented support for sized primitive types in GHCi. Cf !8822.

GHCi: “prim” calling convention

Luite implemented support for the “prim” calling convention in GHCi. Cf !9026

Compiler performance

Jeff

Each of these improves allocations between 0.2 - 0.7% depending on the input (improvements by a thousand cuts):

  • GHC.Foreign improved Strictness: An Attempt to remove lazy IO and SAT an argument that is only used for a debug message. Got a review from Andreas. Want to try to 2 more improvements then ready to merge. !9644

  • InfoTableProv: ShortText → ShortByteString: Post review from Ben I made some improvements that preserved type safety and still recovered most of the performance improvements. Ready to merge !9637

  • GHC.Unit.State: swapped use of Data.Map for GHC.Unique.UniqMap and expanded UniqMap API. Results in progress (need to patch Haddock) and still experimental. The idea here is to use a data structure that no longer needs to balance on insertions because Unit.State performs a lot of merges on these maps.

  • GHC.Utils.Binary.foldGet’ removed lazy IO and lazy accumulator: merged !9538

Josh

  • Stricter break: we noticed in a ticky profile that GHC.List.break allocates 3 thunks and 1 datacon per list element returned the first part of the list. If this list is fully evaluated later, we can allocate only 1 datacon per list element instead. Preliminary results bootstrapping GHC with this change look very promising.

  • FastMutInt (Binary): Josh started reviving Sylvain’s MR !7246 about bundling more than one Int# in a FastMutInt for performance. He tried to make a proof of concept generalisation of 2-FastMutInt into n-FastMutInts (using GHC type level Natural). The types don’t really recurse in a convenient way (Int -> (… -> IO ())) so it would probably introduce more complexity than the problem is worth. Now, he’s just implementing the original patch with the fixes and documentation.

· 20 min read

A new JavaScript backend was merged into GHC on November 30th, 2022! This means that the next release of GHC will be able to emit code that runs in web browsers without requiring any extra tools, enabling Haskell for both front-end and back-end web applications.

In this post, we, the GHC DevX team at IOG, describe the challenges we faced bringing GHCJS to GHC, how we overcame those challenges, and what's left to do. This post is rather long so we've provided these links in case you would like to skip ahead:

Take me to the future of GHCJS
Tell me what to expect
Show me the product roadmap
Tell me how I can help
Just show me how to hello world! (Skip to build instructions)

Why JavaScript? Or, the Big Picture.

To put it simply, the number of users on the internet is as low as it will ever be right now, and it is almost guaranteed that those users use JavaScript. At time of writing, JavaScript holds 97.3% of client-side programming market share (not to mention market share of front-end technologies). Furthermore, JavaScript is not going to disappear anytime soon. As more and more interactivity is pushed onto the internet, JavaScript will become more entrenched because of backwards compatibility, network effects and the amount of capital already devoted to it. JavaScript, like C and COBOL will be with us for the foreseeable future. This makes JavaScript an attractive target; it provides portability, allows us to capitalize on the massive investments in the language and platform, and essentially eliminates the risk that the we build our technology atop a disappearing or deprecating foundation.

WebAssembly is a promising target as well, and Tweag has just merged a WebAssembly backend into GHC (great work and congrats!). WebAssembly is not as ubiquitous as JavaScript yet, and has a harder time interacting with JavaScript directly. Hence, we believe that the WebAssembly and JavaScript backends provide different strengths, and it is to the Haskell community's benefit to have and support both code generation paths in GHC for different use cases and requirements.

Why Haskell?

JavaScript has many problems ranging from the downstream effects of early design decisions (that inhibit programmer productivity and are subtle bug generators), to ecosystem security issues, to fundamental issues with asynchronous and concurrent programming.

These issues are problematic for our product domain. At IOG, a central engineering requirement is to create a code base that has a high degree of correctness. Haskell makes this easy; or to get a little technical, the combination of Strong Static Hindley-Milner based typing allows us to write performant, correct, and maintainable code. In addition to this, many of the problems that occur in JavaScript are simply not expressible because of Haskell's type system and concurrency offerings.

There are, of course, competitors: PureScript targets Javascript and provides a programmer experience close to Haskell's. The benefit of using Haskell instead is code sharing: we can write the front-end of a web app in Haskell that compiles to JavaScript and the back-end in Haskell that compiles to machine code. In particular, the (de)serialization code (e.g. from/to JSON) is shared and cannot get out of sync between the front-end and the back-end.

Why a GHC backend?

Haskell is a language driven by its implementation in GHC. GHC development is very active and GHC does not define a stable interface for compiler backends that are independently maintained, which means that maintaining an out-of-tree backend is costly.

The maintenance burden is not hypothetical; our teammate Luite Stegeman has been developing a fork of GHC that emits JavaScript, called GHCJS, for close to 10 years and has experienced the pain first hand. Any changes to upstream GHC had to be adapted to the customized fork or GHCJS would fall behind. And fall behind it did: at the time of writing, GHCJS has stuck to using GHC 8.10, lagging behind by three major releases and counting.

Similarly, the Eta compilerwhich is targeting the JVMfaced the same issues and appears to be discontinued (compatibility with GHC 7.10.3's Haskell from 2015 is mentioned).

Compounding the issue, the normal Haskell toolchain was not designed for an edge case like GHCJS. So GHCJS required that the normal tooling, e.g., Cabal and Stack, could distinguish between GHC and GHCJS compilers. This meant that the GHCJS developers had to maintain the GHC fork, develop GHCJS, and patch or contribute to Cabal and Stack. Simply put, the maintenance burden was much too high per developer. Examples of differences between GHCJS and GHC:

  • GHCJS had a double versionits own version and the version of GHC it was based onand build tools had to deal with both
  • GHCJS used non-standard file extension (e.g. .js_o and .js_a for objects and static libraries respectively) and custom file formats (still true for .o but no longer true for .a)

So instead of spending engineering time and energy responding to ecosystem changes and maintenance, the DevX team decided the best course of action was to enhance GHC's cross-compilation support and add a proper JavaScript backend based on GHCJS. We feel that this adds value to the entire Haskell ecosystem, keeps the JavaScript backend in sync with GHC, provides a better user experience for all, reduces maintenance costs, and greatly improves the backends in GHC in general. By implementing support for a JavaScript backend in GHC, we also improve GHC's support for cross-compilation (and testing cross-compilers), which is directly applicable to the WebAssembly, iOS, and Android backends in GHC.

Is GHCJS Dead?

Not yet! As it stands, the JavaScript backend doesn't provide all the features provided by GHCJS. In particular it doesn't support Template Haskell and we've removed the extended GHCJS FFI syntax to refine its design. See our roadmap below for more details.

Nevertheless GHCJS is unlikely to be updated to use a GHC version more recent than 8.10.x. So from our point of view it is in maintenance mode until the JavaScript backend totally subsumes its features. New maintainers who want to continue the development of GHCJS until its feature set has been fully subsumed by mainline GHC are of course welcome.

What is Missing From GHCJS?

The JavaScript backend borrows a lot of code from GHCJS, but not all of it. Here are the main differences between GHCJS and the JavaScript backend:

  1. GHCJS was stuck on GHC version 8.10 while the JavaScript backend follows GHC HEAD.

  2. GHCJS's incremental linking support ("base" bundles) hasn't been ported. This feature required too many changes (such as adding new command-line flags) and would have been backend-specific. This might be implemented in the future if it proves to be useful for the newer Template Haskell implementation, for example.

  3. GHCJS's JavaScript code optimizer hasn't been ported. The code was trying to do too much all at once and consequently was fragile and slow. We plan to work on an intermediate representation between STG and JavaScript to perform the same optimizations with better performance, maintainability, and reliability.

  4. GHCJS's compactor (link time optimizations) code hasn't been ported. Some optimizations have been reimplemented (e.g. global renaming of local identifiers), but some other are lacking (e.g. compacting initialization code). We plan to work on this as part of a larger effort on refactoring the code generator, the linker, and some aspects of the runtime system. More details are available in GHC issue #22352.

  5. GHCJS's hacky support for plugins hasn't been ported. Instead we implemented a new way to load plugins from shared libraries that works in any GHC cross-compiler. See #20964 and !7377.

    The common and convenient approach to load plugins still isn't supported by GHC when it is used as a cross-compiler (see #14335 for more details).

  6. GHCJS's support for Template Haskell hasn't been ported. GHCJS had its own implementation of an external interpreter (THRunner) which has been used as an inspiration to implement GHC's external interpreter (IServ). While serving the same purpose, IServ is quite different from THRunner and can't be directly used as a substitute for it. Retrofitting THRunner into Iserv is our next priority. More details on https://engineering.iog.io/2022-05-17-javascript-template-haskell-external-interpreter

  7. GHCJS supported an extended FFI import syntax allowing Javascript code to be inlined (the FFI import string supports templates of Javascript code with placeholders for arguments). This hasn't been ported because adding a JavaScript parser to GHC was difficult and complex, and the imported code made no safety guarantees whatsoever. For now, only JavaScript function calls are supported.

  8. Any command-line flag introduced by GHCJS has not been ported. We didn't make any change to GHC's command line in this work except for adding a -ddump-js flag. Other options will be added later as needed.

  9. The JavaScript backend itself hasn't been optimized and we even removed some undocumented uses of NFData from GHCJS's code. We intend to optimize the JavaScript backend in a principled way (e.g. by first gathering evidence with profiling).

What's on the JS Backend's Roadmap?

Our top priorities are:

  • Implementing Template Haskell support.
  • Reducing generated JavaScript code size.
  • Modernizing the generated JavaScript code. The code generator adapted from GHCJS does not use more modern JavaScript features such as fat-arrows (=>), symbols and let bindings. We aim for the JavaScript backend to emit JavaScript that comports with ECMA-262.
  • Enhancing the run-time performance of the generated code

What has Improved Compared to GHCJS?

Or, why did it take you so long to port a stripped GHCJS into GHC? While it may seem like such a task should be relatively quickespecially in a language with such a good refactoring story like Haskellthere were numerous road blocks that we needed to remove before adding the backend. In particular, here were the troublesome bits:

Removing the Use of External Libraries

GHCJS used libraries that aren't already dependencies of GHC, such as text, lens, attoparsec, and aeson. As we didn't want to add new dependencies to GHC, we've refactored the code to avoid them. Examples:

  • we've replaced Text with GHC's ShortText (which provides a similar API) and finally with GHC's FastString in most cases (which is usually more performant).
  • we've replaced a lot of lens-heavy code with its non-lens equivalents, because GHC does not use lenses itself, and a design requirement was to stay within existing code conventions.
  • we've replaced pretty with GHC's pretty-printer (SDoc, etc.).
  • we've replaced binary with GHC's Binary instances.

GHCJS used to provide its own base and prim libraries: ghcjs-base and ghcjs-prim. We've merged those into the existing base and ghc-prim libraries.

Reusing GHC's Build System: Hadrian

GHCJS has a reputation for being complex to build. It relied on custom build scripts to deal with the GHC fork it uses. The JavaScript backend however is as easy to build as any other GHC. It doesn't require any wrapper script, only the emconfigure tool provided by the Emscripten project.

With a fresh checkout of the GHC source tree, you can now build a GHC with the JavaScript backend with just these commands:

> ./boot
> emconfigure ./configure --target=javascript-unknown-ghcjs
> ./hadrian/build --bignum=native -j

Note that if this doesn't work, up to date instructions and troubleshootings can be found on https://gitlab.haskell.org/ghc/ghc/-/wikis/javascript-backend/building

The Hadrian build system has been adapted to support Cabal's js-sources stanzas that are to support user-provided .js files. Both the rts and base packages required this feature.

Support for Running GHC's Test Suite

GHC's entire test suite can now run and check the JavaScript backend! We had to tweak Hadrian to make this possible (to make Hadrian cross-compiler aware), but the test suite has already found some bugs that we have since fixed.

However, in order to merge for the GHC 9.6 release we had to disable many tests because of missing features (Template Haskell, Haskell Program Coverage (HPC), compact regions, etc.) or because the generated code would time out (not surprising given the missing optimizer and compactor).

But in the process of disabling those tests we've laid a good path forward. We've added more precise properties to the test suite, which indicate the required features to run each test. So when we implement some feature, it will be painless to re-enable all its tests. In addition, failing tests now have proper tickets in GHC's issue tracker.

We've spent some time trying to run the test suite on CI but this work wasn't ready in time to be included in the initial commit with the rest of the backend. For now, only some basic testing is done on CI: compiling a non trivial program that uses the GHC library into JavaScript and executing it. Nevertheless, we have a merge request in the works so that future contributions should be properly validated by running the test suite on CI soon.

For the time being, the following command will run the test suite locally:

./hadrian/build --bignum=native -j2 test

We use -j2 to avoid running too many tests in parallel as this could allocate too much memory and fail, which isn't surprising as the JavaScript backend hasn't been optimized for memory usage yet.

Upgrading from GHC 8.10 to GHC 9.6

The latest version of GHCJS is based on a fork of GHC 8.10.7. We spent a significant amount of time adapting the code generator to support GHC HEAD. In practice this meant:

  • Adding support for new primops, especially sized primitives.
  • Adapting to ghc-bignum changes.
  • Adapting to internal changes.
  • Fixing support for polymorphism in kinds.
  • Fixing support for unlifted newtypes.
  • Fixing support for unboxed sums.
  • Many other fixes...

Fixing Some Performance Issues

As we haven't ported GHCJS's Compactor, output size was predictably incredibly large. So we've spent time re-implementing a crucial piece of the Compactorrenaming and shortening of local variablesusing a different approach. Our new approach ended up being faster than GHCJS's compactor. For the GHC devs out there, as we first replaced the Text type with the FastString type, the newer Compactor can now replace a FastString-based identifier with a new identifier derived from the FastString's Unique in constant time.

Removal of Custom File Extensions and Support for JavaScript Pragmas

GHCJS used the .js.pp file extension to identify JavaScript files that needed to be passed through CPP before being valid JavaScript. Adding support for this extension in both Hadrian and GHC proved to be more work than just adding support for JavaScript pragmas. So we decided to do the latter; similarly to Haskell extension pragmas, you can now write //#OPTIONS: CPP in your JavaScript files to enable the CPP pass, and the file extension is always .js.

While we're on the topic of file extensions, technically .js files don't have to be compiled into .o files (contrary to C/C++/Haskell/etc. files) at all. However, build systems (Hadrian, Cabal...) and compilers (GHC) expect this. So for consistency with other backends, we've added a fake compilation pass for .js files too. They are now renamed into .o files with a //JAVASCRIPT header added to distinguish them from object files produced by the JavaScript backend (and from Emscripten, in the future).

Cleanup and Documentation

GHC provides some utilities (pretty-printer, binary serialization, string interning, etc.) that GHCJS did not make use of. So we adapted the GHCJS code to exploit these utilities, keep the JavaScript backend similar to other backends, and for better performance.

Three of us (out of four) were totally new to GHCJS's code base. We strived to grok the code and to make it understandable by adding a lot of comments and refactoring. Throughout this process we logged our learning in our engineering blog to explain some (sadly not all) technical details about GHCJS's internals:

Plugin Support in Cross-Compilers

GHC doesn't support plugins when built as a cross-compiler (cf #14335). This is because it cannot yet support two environments at once: one for the target code (JavaScript code here) and one for the host (e.g. native x86 or AArch64 code for the plugin). We've spent a lot of time making it more modular (see the Modularizing GHC white paper we published earlier this year and Sylvain's lightning talk at HIW 2022) but there is a lot more to do to achieve this (cf #17957).

GHCJS used a fragile hack to support plugins: at plugin loading time it would substitute the plugin unit with another corresponding one from another package database (For the non-GHC devs out there interested in GHC Units see this note). This was fragile because it could violate GHC's single environment assumptions.

GHCJS's hack did not get ported. Nevertheless we have implemented a new way for GHC to load plugins directly from libraries instead of packages (#20964/!7377). This method doesn't require GHC to load module interfaces for the plugin and its dependencies, hence workarounds GHC's limitations.

What About Libraries Using C Sources?

Libraries that use C sources (c-sources Cabal stanza) aren't supported by the JavaScript backend. In the future we plan to use Emscripten to compile C sources and then to generate some adapter code for them, but this isn't done yet.

For now, there are two ways to fix libraries that use C sources. The C code can either be rewritten in Javascript, or it can be rewritten in Haskell. Then it is possible to use Cabal predicates (e.g. arch(js)) to select between the different versions.

We do have a preference for writing pure Haskell versions because it is more future proof. For example if someone adds some new backends for Lua, Java, CLR, etc. then the Haskell version can be directly compiled by that backend and there is no extra work. In contrast, if the C source is rewritten in JavaScript, then it would need to be rewritten for each backend.

That is the approach we've taken when we wrote the ghc-bignum library. Ghc-bignum provides a "native" implementation written in Haskell that is functionally equivalent to the GMP based implementation. Of course, besides being more future proof the Haskell version is just more pleasant to write than the Javascript version.

Note that GHCJS came with a "shim" library where a shim is JavaScript source code specifically for some package. For example, GHCJS provided shims for packages like text, process, and hashable. We do not intend the JavaScript backend to provide shims so these JavaScript sources will have to be upstreamed or reimplemented in Haskell.

Note that the linking behavior is different due to the interpreted nature of Javascript. In the JavaScript backend, we can link with libraries using foreign imports even if the imported functions don't exist. Instead of failing at link time (which is what usually happens with native code) a JavaScript exception is raised only when and if the imported function is called.

How to Help?

We have now reached our first milestone; anyone can easily build and test the JavaScript backend, and anyone can open bug reports or offer patches for the JavaScript backend on GHC's GitLab.

For those who offered their help this year: thank you! Until now it was difficult to split the work into independent tasks (one fix led to a new failure, which led to an architectural issue, etc.) and it was difficult to coordinate with people outside of our team. However, we're now in a much better position to discuss suggestions and to test/review patches in the spirit of open source.

tl;dr Just Tell Me How to Say Hello World

You need:

  • Emscripten version 3.14 or better. Be sure that your emscripten is bundled with either LLVM 15 or an up to date, patched LLVM 14.
  • Nodejs, latest stable version. Only if you want to run the compiled JavaScript with node.

Most Linux distributions will have the necessary LLVM patches. If you're on NixOS, you'll need to use llvm_git and hope for the best. This fork of ghc.nix will also be useful to you.

Checkout the GHC source

git clone --recurse-submodules https://gitlab.haskell.org/ghc/ghc.git
cd ghc # ensure you are in the ghc source tree for the following commands

Update the submodules

git submodule update --init --recursive

Boot and Configure for JavaScript

./boot && emconfigure ./configure --target=js-unknown-ghcjs

You should see configure finish and report something similar:

----------------------------------------------------------------------
Configure completed successfully.

Building GHC version : 9.5.20220819
Git commit id : 08c3c4783c72d3173d79ccda2ac282e2d3e04e34

Build platform : x86_64-unknown-linux
Host platform : x86_64-unknown-linux
Target platform : js-unknown-ghcjs

Bootstrapping using : /nix/store/4bkmkc7c98m4qyszsshnw9iclzzmdn4n-ghc-9.2.3-with-packages/bin/ghc
which is version : 9.2.3
with threaded RTS? : YES

Using (for bootstrapping) : /nix/store/yzs8390walgk2rwl6i5li2g672hdn0kv-gcc-wrapper-11.3.0/bin/cc
Using clang : /nix/store/p894nlicv53firllwgrfxfi51jzckh5l-emscripten-3.1.15/bin/emcc
which is version : 15.0.0
linker options :
Building a cross compiler : YES
Unregisterised : NO
TablesNextToCode : YES
Build GMP in tree : NO
hs-cpp : /nix/store/p894nlicv53firllwgrfxfi51jzckh5l-emscripten-3.1.15/bin/emcc
hs-cpp-flags : -E -undef -traditional -Wno-invalid-pp-token -Wno-unicode -Wno-trigraphs
ar : /nix/store/p894nlicv53firllwgrfxfi51jzckh5l-emscripten-3.1.15/bin/emar
ld : /nix/store/p894nlicv53firllwgrfxfi51jzckh5l-emscripten-3.1.15/bin/emcc
nm : /nix/store/0dp0bfg9sncg7bjy389zwyg2gskknm6b-emscripten-llvm-3.1.15/bin/llvm-nm
objdump : /nix/store/zgvxnf9047rdd8g8kq2zxxm9k6kfqf8b-binutils-2.38/bin/objdump
ranlib : /nix/store/p894nlicv53firllwgrfxfi51jzckh5l-emscripten-3.1.15/bin/emranlib
otool : otool
install_name_tool : install_name_tool
windres :
dllwrap :
genlib :
Happy : /nix/store/ijdmyaj6i6hgx5ll0lxxgcm9b0xn8nma-happy-1.20.0/bin/happy (1.20.0)
Alex : /nix/store/qzgm2m7p7xc0fnyj4vy3jcmz8pvbg9p7-alex-3.2.6/bin/alex (3.2.6)
sphinx-build : /nix/store/27dk5i52465a4azjr2dqmrhyc0m4lpf2-python3.9-sphinx-4.5.0/bin/sphinx-build
xelatex : /nix/store/8jc2258h4nqzqjy303zzkssd3ip675pf-texlive-combined-2021/bin/xelatex
makeinfo : /run/current-system/sw/bin/makeinfo
git : /nix/store/vsr2cn15h7cbwd5vqsam2ab2jzwfbyf9-git-2.36.0/bin/git
cabal-install : /nix/store/cjmd2qv1b5pdw4lxh1aw4xwwy4ibnb2p-cabal-install-3.6.2.0/bin/cabal

Using LLVM tools
clang : clang
llc : llc
opt : opt

HsColour was not found; documentation will not contain source links

Tools to build Sphinx HTML documentation available: YES
Tools to build Sphinx PDF documentation available: YES
Tools to build Sphinx INFO documentation available: YES
----------------------------------------------------------------------

Be sure to verify that ar, ld, nm and friends point to the emscripten versions, i.e., the output shows <tool> : <some-path>-emscripten-<tool>.

Build the JavaScript backend

./hadrian/build --bignum=native -j

Now Compile Hello World

module Main where

main :: IO ()
main = putStrLn "Hello JS!"
$ <path-to-ghc-root-dir>/_build/ghc-stage1 -fforce-recomp Main.hs
$ ./Main
$ Hello JS!

Under the hood Main is just a JavaScript program written as a script with nodejs as the interpreter. This means you can treat the compiled program like any other JavaScript program: loading it into JavaScript tooling or hack on it by hand. This also means that all compiled programs, such as Main, are human-readable, for example here are the first ten lines:

$ head Main
#!/usr/bin/env node
var h$currentThread = null;
var h$stack = null;
var h$sp = 0;
var h$initStatic = [];
var h$staticThunks = {};
var h$staticThunksArr = [];
var h$CAFs = [];
var h$CAFsReset = [];
var h$regs = [];

The program begins with a shebang instructing the operating system to send the rest of the file to nodejs. The remaining lines are our actual program, which starts with global variables that the runtime system, garbage collector, and scheduler need. Now human-readable is not the same as easy to understand, for example here is the logic that implements a Maybe:

function h$baseZCGHCziMaybeziJust_con_e() { return h$rs() };
function h$baseZCGHCziMaybeziJust_e() {
var h$$13be2042 = h$r2;
h$r1 = h$c1(h$baseZCGHCziMaybeziJust_con_e, h$$13be2042);
return h$rs();
};
function h$baseZCGHCziMaybeziNothing_con_e() { return h$rs() };

If you would like to understand this code and how the JavaScript backend works in general please see our other blog posts. In any case, we invite you to try it out, hack, and be merry!

Acknowledgements

We want to thank Jan Hrcek, and David Thrane Christiansen for their time, labor, comments, and suggestions on drafts of this blog post.

· 9 min read

Introduction

I recently gave a short presentation about heap objects representation in GHCJS and hence in the upcoming JS backend for GHC. This post is a summary of the content.

Heap objects

GHC implements Haskell code evaluation by using graph reduction. As such Haskell programs compiled by GHC use the heap to store nodes of the graph to be reduced and utility nodes participating in graph reduction. These nodes are:

  • FUN: functions with their free variables as payload
  • THUNK: suspensions with their free variables as payload
  • PAP: partial application to a FUN. FUN closure and already applied arguments as payload.
  • IND: indirection to another heap object
  • BLACKHOLE: used to overwrite a THUNK when it is being evaluated

The heap is also used to store other values:

  • CON: boxed values (saturated constructor applications) with field values as payload
  • Other unlifted values: TSO, BCO, arrays, MutVar#, MVar#, TVar#, stacks, stack frames...

Info tables

Many heap objects share the same properties: e.g. all Int CON objects are exactly the same except for their payload (the Int# value) that may be different. Hence heap objects are split in two parts to allow sharing of common properties:

  • info table: statically known properties (at compilation time) that can be shared by several heap objects
  • heap object itself: dynamically allocated in the heap

Heap objects always have the same layout in the native code generated by GHC. They are composed of:

  • a pointer to an info table
  • some words of payload

Heap traversal is done by following the info table pointer of every heap object to query in the info table the layout of the heap object payload.

Info tables contain a pointer to a function called "entry code" that can be specific to each info table. This code is mainly used to apply a node to some arguments. Note that with tables-next-to-code optimisation enabled, to avoid an indirection the info table pointer is actually a pointer to this entry code and the info table itself is stored in the words preceeding the entry code.

Heap objects in JavaScript

GHCJS represents most heap objects with a JavaScript object having the following fields:

{ f, d1, d2, m, cc }

One question I had was: why don't we use a JS array instead of a JS object? Arrays should be faster than objects (i.e. hashmaps), no? It turns out that objects like this are optimised by JS engines using "hidden classes" (see https://v8.dev/blog/fast-properties for an explanation). That's why they are usually more efficient than arrays for which bound checking must be made. Also arrays are larger in memory because they need to store their size.

Let's now discuss the fields of the heap objects.

f field

"f" is the equivalent of the info table pointer. It contains a JavaScript function that is the entry code for the heap object.

Similar to the tables-next-to-code optimisation discussed above, we use the fact that JS functions are objects which have properties to store the info table fields as properties of the function itself.

Example of an info table / entry function:

[Function: h$entry_function_xyz]
{ t // (Int) object type
, size // (Int) number of fields in payload (-1 if variable layout)
, i // (Array) fields layout (empty if variable layout)
, n // (String) object name for debug
, a // (Int) function arity or constructor tag
, r // (Int) arity in number of JS variables
, s // (Array) static refs that must be kept alive (SRT)
, m // GC mark
}

d1, d2 fields

The d1 and d2 fields contain the payload of the heap object: constructor fields, function free variables, etc.

Payloads can be composed of zero, one, or many fields. A naive solution would be to have one JS object field (d1, d2, d3...) per payload field. However it would be bad for two reasons:

  • performance: JS engine hidden classes optimisation mentioned above needs objects to have the same field structure.

  • genericity: we couldn't write generic functions (e.g. to copy a closure) without dynamically querying the number of fields composing the payload.

Another solution would be to use a single field to store the whole payload. It would fulfill the genericity constraint. However performance may not be good because of the extra allocation of the object containing the payload and the indirection to access its fields.

Instead GHCJS uses a middle ground approach: it always uses only two JS object fields to store any number of payload fields. The following encoding is used to stash any number of payload fields into two JS fields:

Payloadd1d2
[]nullnull
[a]anull
[a,b]ab
[a,b,c]a{d1=b,d2=c}
[a,b,c,d...]a{d1=b,d2=c,d3=d...}

It still fulfills the genericity constraint and small objects (up to two fields of payload) don't pay for an extra allocation/indirection. The price to pay is that two fields of payload are always allocated, even for for objects with 1 field of payload.

It would be interesting to benchmark the performance of the different payload representations.

m field

The "m" field is used both for reachability checking (~ garbage collection) and to implement the "stable names" features.

GHCJS can't rely on the JS engine to know when a heap object is collected. So it implements its own heap traversal algorithm for this. The "m" field is used as a marker for this algorithm (it will be the topic of a future blog post). In this case, the "m" field is a number (a GC mark).

When a StableName is created for an object, the "m" field of the object is updated to point to the StableName object:

[h$StableName]
{ m // GC mark
, s // stable name unique id
, ...
}

The "m" field of the StableName object is used in replacement of the mark of the object.

cc field

The "cc" field is the cost center associated to the heap object. This field is only present when profiling mode is enabled. Cost centers are entered (pushed on the cost center stack of the current thread) before the evaluation of thunks and function applications.

Cost centers are allocated with the h$CC function.

Other heap object representation

The generic heap object representation presented above is only used for some objects: those involved in graph reduction (e.g. updatable objects) and values that don't have a fixed layout (e.g. CON objects have different layouts depending on which constructor they represent). The object layout allows generic access to the infotable and to the payload, and the infotable describes the object type and the payload layout.

Several other objects don't need this machinery: they always have the same layout and are never the result of a reduction (they are unlifted values). These objects are represented as JS objects with any fields they need (i.e. not using the d1/d2 encoding above). To determine the type of such heap objects, instead of using the "type" field of an infotable the code uses the instanceof operator. For example a TSO is represented as a h$Thread object.

Note that we could be tempted to give every heap object a different object name and to always use instanceof instead of the infotable "type" properties. It would mean adding h$Con, h$Thunk, h$Fun, h$Pap, h$Blackhole, and h$StackFrame objects. Then all the heap objects could be treated in the same way. However the isssue is that these objects need to be overwritable in place: a Thunk becomes a Fun/Con/Pap/Blackhole, etc. As far as I know we can't update the "instance" of an object, so all these object have to be instances of the same JS object.

Also note that the JS backend doesn't need INDirection nodes because it can always overwrite the fields of a JS object with the fields of another to update a closure. For the record, indirection nodes are needed in backends that layout closures as a chunk of bytes/words and when the size of the closure to update is smaller than the size of the updatee closure.

Automatic unboxing

Sometimes the generic heap object representation is unnecessary. For example, a boxed Int would be represented as a CON heap object with the Int# in its payload, represented as a JavaScript number value. The only thing we can do with this heap object is to pass it around and to extract its payload. As such, it is more memory efficient to directly pass the payload (a JS number).

GHCJS provides an optimisation that consists in automatically unboxing some CON heap objects. For example, Haskell booleans (True and False datacons) are directly mapped to JavaScript booleans, boxed numbers (Float, Double, Int, Word, Int8, etc.) are directly mapped to JavaScript numbers.

We can do this because JavaScript already provides some boxing of its own: we can use the typeof operator on a heap object to know if it is a JS object, a JS number, a JS boolean, etc. It makes it possible to distinguish between heap object representations. In comparison, we can't do this with the native (non-JS) backend when we only have a pointer to a heap object: the pointer doesn't carry the kind of value it points to, hence the pointed memory location must be generic enough for this introspection to be performed (e.g. using infotable pointers).

Summary

Heap object can be represented as JS values (number, boolean) because of the automatic unboxing, or as JS objects: discimination is done with the typeof operator.

Heap objects represented as JS objects come in two flavours:

  • unlifted objects are represented with specific JS objects, disciminated with the instanceof operator
  • other objects use the following generic and updatable structure:
{ f, d1, d2, m, [cc] }

· 8 min read
  1. The Design Space
  2. GHCJSs FFI
  3. Lightweight safety checks
  4. Returning multiple values
  5. Changes in the FFI System for the JS Backend

Users of GHCJS enjoyed a rich FFI system for foreign JavaScript imports. However, this has changed during our adaptation of GHCJS to GHC 9.x. This short post goes over the GHCJS FFI system, the motivation for these changes and what the changes are. First, we must consider the design space of an FFI system.

The Design Space

FFI code is typically employed in high performance scenarios. Additionally, users of the FFI do not want to deal with the object language the compiler is compiling to. Instead, users want a simple way to call functions from the object language and use them in their own code as normal Haskell functions. However, users of the FFI system do tend to be power users, and so as a design principle we want to expose the tools they need to achieve their performance needs, whatever those needs may be. We can summarize these constraints as follows:

  1. The FFI must abstract the JavaScript backends infidelities away as much as possible. That is, users of the FFI should need to worry about the Int64# representation, but should also be able to simply follow standard patterns we have written in base.
  2. The FFI must provide tools to achieve high performance code, even if those tools require up front knowledge of the runtime system to use. However, these tools should not be in the path of least resistance to use the FFI system.
  3. The FFI must provide a lightweight specification that users program against for the JS backend to optimize the imported function and for good error messages for users.

GHCJSs FFI sets a high (qualitative) benchmark on these three constraints. Lets inspect them each in detail, in no particular order.

GHCJSs FFI

In GHCJS, a user could take advantage of JavaScript functions in their Haskell code using the GHCJSs FFI. However, the syntax was unique to GHCJS with place holder variables like one might see in perl, nix, or bash. For example, here is a foreign import from the base library for st_size:

-- base/System/Posix/Internal.hs
-- the JS FFI version
foreign import javascript unsafe "$r1 = h$base_st_size($1_1,$1_2); $r2 = h$ret1;"
st_size :: Ptr CStat -> IO Int64

The syntax is different from what we know and love in the normal Haskell world but the grammar is straightforward. We declare a foreign import from javascript, state that the import is unsafe or interruptible and then provide a string, h$base_fstat(...) for the code generator to use when compiling. Compare this with the C version:

-- base/System/Posix/Internal.hs
-- the C FFI version
foreign import ccall unsafe "HsBase.h __hscore_st_size"
st_size :: Ptr CStat -> IO Int64

And we see that they are similar. The only difference is the strange $n symbols in the referrent string. Contrast this with the C version, which simply declares a name.

These symbols are place holder variables with special meaning in GHCJS. There are two intractable reasons for the placeholder patterns. First, we require these patterns to work around the limitations of JavaScript as a backend (1). For example, consider the case where we need to return an Int64# from an imported foreign function. In C and Haskell this is not a problem because both can represent Int64# natively, however JavaScript only has native support for 32-bit values. Thus, to be able to return an Int64# we need to have a method to return two 32-bit numbers. Similarly, in order to apply a function to an Int64# that function must take at least two arguments, one for the high bits and one for the low. Second, the referrent string is untyped and can contain arbritrary JavaScript code. So placeholder patterns provide a simply and lightweight way for safety checks and eliminate classes of untyped, hard to understand errors. For example, consider an arity mismatch error between a function definition and call site. When this happens JavaScript happily continues processing with the return value from the function application defined as NaN (of course). Such arity conflicts can easily occur, especially when dealing with 64-bit values which require function arity assumptions.

Lightweight safety checks

Lightweight safety checks (3) are done by GHCJS by parsing the names of the place holder variables; each of which follows a specific naming convention. This convention is:

  • Argument types:
    • $n: Used for unary arguments, i.e., arguments which require only a single register.
    • $n_n: Used for binary arguments, i.e., arguments which require two registers.
    • $c: A continuation argument, only valid for interruptible foreign functions.
  • Return types:
    • $r: a unary return
    • $r1, $r2: a binary return
    • $r1, $r2, $r3_1, $r3_2: unboxed tuple return
  • Top level patterns:
    • "&value": simply emitted as value by the code generator
    • "someFunction": emitted as ret = someFunction(...), i.e., map the FFI to the result of the function call.
    • "$r = $1.f($2)": emitted as r1 = a1.f(a2), i.e., a combination of a function call and a property access.

With this standard GHCJS then parses the FFI referrent string to ensure that it conforms to this standard. If not then GHCJS can at least respond to the user with an ill-formatted FFI message and say precisely where the issue is. For example, it could respond that only half of an Int64# is returned based on the referrent string and the function type.

Returning multiple values

But what of performant code? GHCJS achieves performant FFI by not trying to abstract away from the runtime system. Instead, an advantage of GHCJSs FFI is that we can specify exactly which registers the foreign function should dump its results or even arbitrary global variables. This places more burden on the user of the FFI in specific scenarios, but crucially allows the FFI system to get out of the way of the user. The FFI system also exploits this capability to return multiple values from a single function call, which is a common need when compiling to JavaScript. For example, in the above code st_size is declared to return an IO Int64, the JavaScript handler h$base_st_size returns the Int64 using two registers $r1 and $r2, but does so through the use of a special purpose global variable called h$ret1:

function h$base_st_size(stat, stat_off) {
h$ret1 = (stat.i3[(stat_off>>2)+2]);
return (stat.i3[(stat_off>>2)+1]);
}

The function inputs a pointer and an offset. Pointers in GHCJS are simply pointers to ByteArrays so the function indexes into the ByteArray and retrieves and stores the lower 32-bits in h$ret1, then returns the higher 32-bits directly. These results are picked up by the FFI code, which performs assignment to set $r1 to the result of the function call (the higher 32-bits), and set $r2 to the value of h$ret1 (the lower 32-bits). Crucially, the runtime system needs to do nothing. The registers are already handled ready to be consumed by whatever the caller of the foreign function will do.

One might consider using a simpler design, which trades register juggling for a more straightforward representation such as a ByteArray which stores the Int64#. However, such a design would trade speed for implementation simplicity. If we passed ByteArrays then each foreign function would spend time wrapping and unwrapping the array to get the payload; clearly an undesirable outcome for high performance code.

Changes in the FFI System for the JS Backend

So we see that GHCJSs FFI system actually performs quite well in the design space. Power users are well supported and can leverage enough unsafety to bind global variables like h$ret1 and specific registers such as $r1. The system provides some lightweight checking through parsing. The nuances of the JavaScript platform are generally abstracted over and the FFI system is tuned for performance critical scenarios. So why change it?

The short answer is to hit deadlines. By skipping the FFI parsing the JS Backend team was able to produce a working (can output Hello World!, and compile GHCs boot libraries), integrated, JS backend in GHC faster than had we finished the FFI system.

For the time being, we have opted to replaced each foreign function call with a JavaScript fat arrow, for example:

foreign import javascript unsafe "(($1_1,$1_2) => { return h$base_st_size($1_1,$1_2); })"
st_size :: Ptr CStat -> IO Int64

Of course, this situation is untenable, as argued above, FFI code is assumed to be used in performance critical code, and thus any extra overhead, such as a function closure and consequent indirection, must be avoided. But fear not! In the near future well be overhauling the FFI system and returning it to its former glory.

· 2 min read

This is the July 2022 monthly update from the GHC DevX team at IOG.

JavaScript Backend for GHC

For a few months we have been merging GHCJS (Haskell to JavaScript compiler) into GHC. We set our first milestone to be the ability to compile and to run the usual "Hello World" program. This month we finally reached it!

We are now focusing on:

  • fixing failing tests in GHC's testsuite (~2800 unexpected failures). To do that, we have to implement new primops, to fix bugs we introduced while we ported the code from GHCJS, etc.

  • implementing support for the "js-sources" Cabal stanza in Hadrian. Currently the JS backend finds the JS sources required for the RTS and for base into explicitly defined location. It was only a stop-gap measure and we now need to implement proper support for user-provided JavaScript files.

  • documenting and refactoring the source code and making it similar to other GHC modules. As an example, GHCJS used the text package which isn't a boot package. Hence we first switched to use GHC's ShortText implementation and now we switched to a FastString based implementation.

  • adding back GHCJS's features that we haven't ported for some reasons (e.g. the compactor, TH, etc.).

You can follow our progress on our development branch here.

Blog posts

For the time being, we will focus blog post topics on GHCJS internals and related topics. A few of these blog posts are currently under review and should be published shortly.

· 6 min read

Introduction

I recently gave a short presentation on the workings of the GHCJS linker. This post is a summary of the content.

JavaScript "executables"

The task of a linker is collecting and organizing object files and resources into a loadable library or executable program. JavaScript can be run in various environments, for example the browser or node.js, and not in all of these the concept of an executable makes sense.

Therefore, when we link a Haskell program, we generate a jsexe directory filled with various files that allow us to run the JavaScript result:

FileDescription
out.jscompiled/linked Haskell code
out.frefs.*list of foreign calls from out.js
out.statssource code size origin statistics for out.js
lib.jsnon-Haskell code, from js-sources in packages and RTS. possibly preprocessed
rts.jsgenerated part of RTS (apply functions and similarly repetitive things)
runmain.jssingle line just starts main
all.jscomplete runnable program, created by combining out.js, lib.js, rts.js and runmain.js

Most of the work done by the linker is producing out.js, and that's what we'll be focusing on in the next sections.

Building out.js

The linker builds out.js by collecting all code reachable from main (and a few other symbols required by the RTS) and generating the required initialization code for all top-level data. The code is found in object files. These object files have the following structure:

SectionDescription
Headerversion number and offsets of other sections
String tableshared string table, referred to by Dependencies and Code, to avoid duplication in file and memory
DependenciesDependency data, internally between binding groups and externally to symbols in other object files
CodeCompiled Haskell code stored as serialized JavaScript AST and metadata. Code is organized in binding groups

The object files contain binding groups of mutually dependent bindings. These are the smallest units of code that can be linked. Each binding group has some associated metadata required for initialization of the heap objects in the group. The metadata contains for example constructor tags (e.g. 1 for Nothing, 2 for Just), the arity of functions and static reference tables.

From a high level, the procedure that the linker follows is this:

Step
Read object files from dependencies into memory
Decode dependency part of all object files in dependencies (includes reading the string tables)
Using dependency data, find all code reachable from main
Decode reachable binding groups
Render AST to JavaScript
Construct initializers from metadata

We avoid decoding (deserializing) the binding groups that do end up in the linked result to keep the memory consumption lower. Still the linker requires a lot of memory for larger programs, so we may need to make more improvements in the future.

The Compactor

The compactor is an optional link-time transformation step that reduces code size. It consists of a lightweight (i.e. no expensive operations like dataflow analysis) rewrite of the code contained in the object files. The compactor is disabled when linking with the -debug flag. There are a few steps involved.

Renaming private symbols

Haskell names are quite long by default: they need to be globally unique, hence they contain their defining unit-id and module name. For example: mtl-2.2.2-somehash-Control.Monad.State.Lazy.execState_go1 (special characters would be z-encoded but it isn't shown here).

Private symbols are only referred to from within the same module. It doesn't matter which JavaScript name we pick for them, as long as there is no overlap between the names from different modules. The compactor renames all the private symbols using a global sequence to ensure short names that do not overlap.

Block Initializer

Without the compactor, the linker generates an h$initObj initialization call (or h$o) call for each global Haskell heap value. The code for this can get quite big. The compactor collects all heap objects to be initialized in a single large array and encodes the metadata in a string. This makes the initialization code much more compact.

Deduplication

An optional step in the compactor is deduplication of code. When deduplication is enabled with the -dedupe flag, the compactor looks for functionally equivalent pieces of JavaScript in the output and merges them. This can result in a significant reduction of code size.

Incremental Linking

The linker supports building programs that are loaded incrementally. This is used for example for Template Haskell. The process that runs the Template Haskell stays alive during compilation of a whole module. When the first Template Haskell expression is compiled, it is linked against all its dependencies (including the RTS) and the resulting JavaScript code is sent over to be run in the evaluator process.

As subsequent Template Haskell expressions are evaluated in the same process, there is no need to load already loaded dependencies (including the RTS) again and it is much more efficient to avoid doing so. Therefore the linker keeps track of which dependencies have already been linked and each subsequent TH expression is only linked against dependencies that are not already loaded in the evaluator process.

It's also possible for users to use this functionality directly, with the -generate-base to create a "linker state" file along with the regular jsexe files. Another program can then be linked with -use-base=state_file, resulting in a program which leaves out everything already present in the first program.

Future Improvements

Memory consumption is the biggest problem in the linker at the moment. Possible ways to achieve this are compression, more efficient representation of the data structures or more incremental loading of the parts from the object files that we need.

In terms of functionality, we don't take advantage of JavaScript modules yet. It would be good if we could improve the linker to support linking a library as a JavaScript module. We should also consider making use of foreign export javascript for this purpose.

· 11 min read
  1. GHC Primitives
    1. The Easy Cases
    2. ByteArray#, MutableByteArray#, SmallArray#, MutableSmallArray#,
    3. Addr# and StablePtr#
    4. Numbers: The Involved Case
      1. Working with 64-bit Types
      2. Unwrapped Number Optimization
    5. But what about the other stuff!

One of the key challenges in any novel backend is representing GHC primitive types in the new backend. For JavaScript, this is especially tricky, as JavaScript only has 8 primitive types and some of those types, such as number do not directly map to any Haskell primitive type, such as Int8#. This post walks through the most important GHC primitives and describes our implementation for each in the JavaScript backend. This post is intended to be an explanation-oriented post, light on details, but just enough to understand how the system works.

GHC Primitives

There are 36 primtypes that GHC defines in primops.txt.pp:

  1. Char#
  2. Int8#, Int16#, Int32#, Int64#, Int#
  3. Word8#, Word16#, Word32#, Word64#, Word#
  4. Double#, Float#,
  5. Array#, MutableArray#,, SmallArray#, SmallMutableArray#
  6. ByteArray#, MutableByteArray#
  7. Addr#
  8. MutVar#, TVar#, MVar#,
  9. IOPort#, State#, RealWorld, ThreadId#
  10. Weak#, StablePtr#, StableName#, Compact#, BCO,
  11. Fun, Proxy#
  12. StackSnapshot#
  13. VECTOR

Some of these are unsupported in the JS-backend, such as VECTOR or lower priority such as StackSnapshot#. Well begin with the easy cases.

The Easy Cases

The easy cases are the cases that are implemented as JavaScript objects. In general, this is the big hammer used when nothing else will do. Well expand on the use of objectsespecially representing heap objectsin a future post, but for the majority of cases we mimic the STG-machine behavior for GHC heap objects using JavaScript heap objects. For example,

var someConstructor =
{ f = // entry function of the datacon worker
, m = 0 // garbage collector mark
, d1 = first arg // First data field for the constructor
, d2 = arity = 2: second arg // second field, or object containing the remaining fields
arity > 2: { d1, d2, ...} object with remaining args (starts with "d1 = x2"!)
}

This is the general recipe; we define a JavaScript object that contains properties which correspond to the entry function of the heap object; in this case that is the entry function, f for a constructor, some meta data for garbage collection m, and pointers to the fields of the constructor or whatever else the heap object might need. Using JavaScript objects allows straightforward translations of several GHC types. For example TVars and MVars:

// stg.js.pp
/** @constructor */
function h$TVar(v) {
TRACE_STM("creating TVar, value: " + h$collectProps(v));
this.val = v; // current value
this.blocked = new h$Set(); // threads that get woken up if this TVar is updated
this.invariants = null; // invariants that use this TVar (h$Set)
this.m = 0; // gc mark
this._key = ++h$TVarN; // for storing in h$Map/h$Set
#ifdef GHCJS_DEBUG_ALLOC
h$debugAlloc_notifyAlloc(this);
#endif
}

// stm.js.pp
function h$MVar() {
TRACE_SCHEDULER("h$MVar constructor");
this.val = null;
this.readers = new h$Queue();
this.writers = new h$Queue();
this.waiters = null; // waiting for a value in the MVar with ReadMVar
this.m = 0; // gc mark
this.id = ++h$mvarId;
#ifdef GHCJS_DEBUG_ALLOC
h$debugAlloc_notifyAlloc(this);
#endif
}

Notice that both implementations defined properties specific to the semantics of the Haskell type. JavaScript functions which create these objects follow the naming convention h$<something> and reside in Shim files. Shim files are JavaScript files that the JS-backend links against and are written in pure JavaScript. This allows us to save some compile time by not generating code which doesnt change, and decompose the backend into JavaScript modules.

This strategy is also how functions are implemented in the JS-backend. Function objects are generated by StgToJS.Expr.genExpr and StgToJS.Apply.genApp but follow this recipe:

var myFUN =
{ f = <function itself>
, m = <garbage collector mark>
, d1 = free variable 1
, d2 = free variable 2
}

To summarize; for most cases we write custom JavaScript objects which hold whatever machinery is needed as properties to satisfy the expected semantics of the Haskell type. This is the strategy that implements: TVar, MVar, MutVar and Fun.

ByteArray#, MutableByteArray#, SmallArray#, MutableSmallArray#,

ByteArray# and friends map to JavaScript's ArrayBuffer object. The ArrayBuffer object provides a fixed-length, raw binary data buffer. To index into the ArrayBuffer we need to know the type of data the buffer is expected to hold. So we make engineering tradeoff; we allocate typed views of the buffer payload once at buffer allocation time. This prevents allocations from views later when we might be handling the buffer in a hot loop, at the cost of slower initialization. For example, consider the mem.js.pp shim, which defines ByteArray#:

// mem.js.pp
function h$newByteArray(len) {
var len0 = Math.max(h$roundUpToMultipleOf(len, 8), 8);
var buf = new ArrayBuffer(len0);
return { buf: buf
, len: len
, i3: new Int32Array(buf)
, u8: new Uint8Array(buf)
, u1: new Uint16Array(buf)
, f3: new Float32Array(buf)
, f6: new Float64Array(buf)
, dv: new DataView(buf)
, m: 0
}
}

buf is the payload of the ByteArray#, len is the length of the ByteArray#. i3 to dv are the views of the payload; each view is an object which interprets the raw data in buf differently according to type. For example, i3 interprets buf as holding Int32, while dv interprets buf as a DataView and so on. The final property, m, is the garbage collector marker.

Addr# and StablePtr#

Addr# and StablePtr# are implemented as a pair of ByteArray# and an Int# offset into the array. Well focus on Addr# because StablePtr# is the same implementation, with the exception that the StablePtr# is tracked in the global variable h$stablePtrBuf. Addr#s do not have an explicit constructor, rather they are implicitly constructed. For example, consider h$rts_mkPtr which creates a Ptr that contains an Addr#:

function h$rts_mkPtr(x) {
var buf, off = 0;
if(typeof x == 'string') {

buf = h$encodeUtf8(x);
off = 0;
} else if(typeof x == 'object' &&
typeof x.len == 'number' &&
x.buf instanceof ArrayBuffer) {

buf = x;
off = 0;
} else if(x.isView) {

buf = h$wrapBuffer(x.buffer, true, 0, x.buffer.byteLength);
off = x.byteOffset;
} else {

buf = h$wrapBuffer(x, true, 0, x.byteLength);
off = 0;
}
return (h$c2(h$baseZCGHCziPtrziPtr_con_e, (buf), (off)));
}

The function does some type inspection to check for the special case on string. If we do not have a string then a Ptr, which contains an Addr#, is returned. The Addr# is implicitly constructed by allocating a new ArrayBuffer and an offset into that buffer. The object case is an idempotent check; if the input is already such a Ptr, then just return the input. The cases which do the work are the cases which call to h$wrapBuffer:

// mem.js.pp
function h$wrapBuffer(buf, unalignedOk, offset, length) {
if(!unalignedOk && offset && offset % 8 !== 0) {
throw ("h$wrapBuffer: offset not aligned:" + offset);
}
if(!buf || !(buf instanceof ArrayBuffer))
throw "h$wrapBuffer: not an ArrayBuffer"
if(!offset) { offset = 0; }
if(!length || length < 0) { length = buf.byteLength - offset; }
return { buf: buf
, len: length
, i3: (offset%4) ? null : new Int32Array(buf, offset, length >> 2)
, u8: new Uint8Array(buf, offset, length)
, u1: (offset%2) ? null : new Uint16Array(buf, offset, length >> 1)
, f3: (offset%4) ? null : new Float32Array(buf, offset, length >> 2)
, f6: (offset%8) ? null : new Float64Array(buf, offset, length >> 3)
, dv: new DataView(buf, offset, length)
};
}

h$wrapBuffer is a utility function that does some offset checks and performs the allocation for the typed views as described above.

Numbers: The Involved Case

Translating numbers has three issues. First, JavaScript has no concept of fixed-precision 64-bit types such as Int64# and Word64#. Second, JavaScript bitwise operators only support signed 32-bit values (except the unsigned right shift operator of course). Third, numbers are atomic types and do not require any special properties for correct semantics, thus using wrapping objects gains us nothing at the cost of indirection.

Working with 64-bit Types

To express 64-bit numerics, we simply use two 32-bit numbers, one to express the high bits, one for the low bits. For example, consider comparing two Int64#:

// arith.js.pp
function h$hs_ltInt64(h1,l1,h2,l2) {
if(h1 === h2) {
var l1s = l1 >>> 1;
var l2s = l2 >>> 1;
return (l1s < l2s || (l1s === l2s && ((l1&1) < (l2&1)))) ? 1 : 0;
} else {
return (h1 < h2) ? 1 : 0;
}
}

The less than comparison function expects four inputs, two for each Int64# in Haskell. The first number is represented by h1 and l1 (high and low), and similarly the second number is represented by h2 and l2. The comparison is straightforward, we check equivalence of our high bits, if equal then we check the lower bits while being careful with signedness. No surprises here.

For the bitwise operators we store both Word32# and Word# as 32-bit signed values, and then map any values greater or equal 2^31 bits to negative values. This way we stay within the 32-bit range even though in Haskell these types only support nonnegative values.

Unwrapped Number Optimization

The JS backend uses JavaScript values to represent both Haskell heap objects and unboxed values (note that this isn't the only possible implementation, see 1). As such, it doesn't require that all heap objects have the same representation (e.g. a JavaScript object with a "tag" field indicating its type) because we can rely on JS introspection for the same purpose (especially typeof). Hence this optimization consists in using a more efficient JavaScript type to represent heap objects when possible, and to fallback on the generic representation otherwise.

This optimization particularly applies to Boxed numeric values (Int, Word, Int8, etc.) which can be directly represented with a JavaScript number, similarly to how unboxed Int#, Word#, Int8#, etc. values are represented.

Pros:

  • Fewer allocations and indirections: instead of one JavaScript object with a field containing a number value, we directly have the number value.

Cons:

  • More complex code to deal with heap objects that can have different representations

The optimization is applicable when:

  1. We have a single data type with a single data constructor.
  2. The constructor holds a single field that can only be a particular type.

If these invariants hold then, we remove the wrapping object and instead refer to the value held by the constructor directly. Int8 is the simplest case for this optimization. In Haskell we have:

data Int8 = Int8 Int8#

Notice that this definition satisfies the requirements. A direct translation in the JS backend would be:

// An Int8 Thunk represented as an Object with an entry function, f
// and payload, d1.
var anInt8 = { d1 = <Int8# payload>
, f : entry function which would scrutinize the payload
}

We can operationally distinguish between a Thunk and an Int8 because these will have separate types in the StgToJS GHC pass and will have separate types (object vs number) at runtime. In contrast, in Haskell an Int8 may actually be a Thunk until it is scrutinized and then becomes the Int8 payload (i.e., call-by-need). So this means that we will always know when we have an Int8 rather than a Thunk and therefore we can omit the wrapper object and convert this code to just:

// no object, just payload
var anInt8 = = <Int8# payload>

For the interested reader, this optimization takes place in the JavaScript code generator module GHC.StgToJS.Arg, specifically the functions allocConStatic, isUnboxableCon, and primRepVt.

But what about the other stuff!

  • Char#: is represented by a number, i.e., the code point
  • Float#/Double#: Both represented as a JavaScript Double. This means that Float# has excess precision and thus we do not generate exactly the same answers as other platforms which are IEEE754 compliant. Full emulation of single precision Floats does not seem to be worth the effort as of writing. Our implementation represents these in a ByteArray#, where each Float# takes 4 bytes in the ByteArray#. This means that the precision is reduced to a 32-bit Float.

  1. An alternative approach would be to use some JS ArrayBuffers as memory blocks into which Haskell values and heap objects would be allocated. As an example this is the approach used by the Asterius compiler. The RTS would then need to be much more similar to the C RTS and the optimization presented in this section wouldn't apply because we couldn't rely on introspection of JS values.

· 4 min read

Introduction

I recently gave a short presentation on the topic of threads in GHCJS to the GHC team at IOG. This blog post is a summary of the content.

JavaScript and Threads

JavaScript is fundamentally single threaded. There are ways to share specific data between tasks but it's not possible to run multiple threads that have access to a shared memory space of JavaScript data.

The single JavaScript thread is often responsible for multiple tasks. For example a node.js server handles multiple simultaneous connections and a web application may be dealing with user input while downloading new data in the background.

This means that any single task should take care to never block execution of the other task. JavaScript's canonical answer is to use asynchronous programming. A function reading a file returns immediately without waiting for the file data to be loaded in memory. When the data is ready, a user-supplied callback is called to continue processing the data.

Haskell Threads

Concurrent Haskell supports lightweight threads through forkIO. These threads are scheduled on top of one more more operating system thread. A blocking foreign call blocks an OS thread but other lightweight threads can still run on other OS threads if available.

There is no built-in support for foreign calls with a callback in the style of JavaScript. Functions imported with foreign import ccall interruptible can be interrupted by sending an asynchronous exception to the corresponding lightweight thread.

Lightweight Threads in JavaScript

GHCJS implements lightweight threads on top of the single JavaScript thread. The scheduler switches between threads and handles synchronization through MVar and STM as expected from other Haskell platforms.

Foreign calls that don't block can be handled in the usual way. We extend the foreign function interface with a new type foreign import javascript interruptible that conveniently supports the callback mechanism used by JavaScript frameworks. The foreign call is supplied with an additional argument $c representing a callback to be called with the result when ready. From the Haskell side the corresponding lightweight thread is blocked until $c is called. This type of foreign call can be interrupted with an asynchronous exception to the lightweight Haskell thread.

By default, Haskell threads in the JS environment run asynchronously. A call to h$run returns immediately and starts the thread in the background. This works for tasks that does not require immediate actions. For situations that require more immediate action, such as dealing with event handler propagation, there is h$runSync. This starts a synchronous thread that is not interleaved with other task. If possible, the thread runs to completion before the call to h$runSync returns. If the thread blocks for any reason, such as waiting for an MVar or a foreign import javascript interruptible call, synchronous execution cannot complete. The blocking task is then either interrupted with an exception or the thread is "demoted" to a regular asynchronous thread.

Black Holes

When a Haskell value is evaluated, its heap object is overwritten by a black hole. This black hole marks the value as being evaluated and prevents other threads from doing the same. "black holing" can be done either immediately or "lazily", when the garbage collector is run. GHCJS implements immediate blackholing.

Black holes give rise to an interesting problem in the presence of synchronous and asynchronous threads. Typically if we use h$runSync, we want to have some guarantee that at least part of the task will run succesfully without blocking. For the most past it's fairly clear which parts of our task depends on potentially blocking IO or thread synchronization. But black holes throw a spanner in the works: Suddenly any "pure" data structure can be a source of blocking if it is under evaluation by another thread.

To regain some predictability and usability of synchronous threads, the h$runSync scheduler can run other Haskell threads in order to "clear" a black hole. The process ends all black holes have been cleared or when any of the black holes is impossible to clear because of a blocking situation.

This all happens transparantly to the caller of h$runSync, if the black holes could be cleared it appears as if they were never there.

Conclusion

We have lightweight Haskell threads in the single-threaded JavaScript environment and extend the foreign function interface to easily support foreign calls that depend on an asynchronous callback. This way, only the Haskell lightweight thread blocks.

By default, Haskell threads are asynchronous and run in the background: The scheduler interleaves the tasks and synchronization between threads. For situations that require immediate results or actions there are synchronous threads. Synchronous threads cannot block and are not interleaved with other tasks except when a black hole is encountered.

· 2 min read

This is the June 2022 monthly update from the GHC DevX team at IOG.

JavaScript Backend for GHC

For a few months we have been merging GHCJS (Haskell to JavaScript compiler) into GHC. We set our first milestone to be the ability to compile and to run the usual "Hello World" program. It turned out to be much more involved than we initially thought (requiring FFI support, etc.), but we should be getting there soon.

This month we have made the following progress:

  • Linking: GHCJS requires some functions to be directly implemented in JavaScript (e.g. the RTS, some low-level functions in base). We have added support for linking .js files. We've also added support for a preprocessing pass with CPP for .js.pp files.

  • js-sources: there is some ongoing work to load these external JavaScript files from installed libraries. Cabal provides a js-sources stanza for this, we need to adapt Hadrian to make use of it.

  • Binary vs Objectable: GHCJS used its own ByteString-based Objectable type-class: we replaced it with GHC's similar Binary type-class. Josh has published a blog post about their differences.

  • 64-bit primops: we've added support for 64-bit primops (Word64# and Int64# types). In GHCJS (GHC 8.10), these were still implemented as foreign function calls. It's no longer true on GHC head.

  • base library: added CPP as required to support the JS backend. Ported and converted FFI imports from GHCJS to use JavaScript fat arrows (we haven't implemented GHCJS's fancy import syntax yet).

Now we can compile and link the "HelloWorld" program. To reach the first milestone we only have to fix the remaining runtime errors.

You can follow our progress on our development branch here. We now rebase this branch every Friday to avoid lagging too much behind GHC head.

Haskell Optimization Handbook

The "Haskell Optimization Handbook" is an accepted proposal of the Haskell Foundation. Jeff has been steadily writing some initial material as per the project plan.

· 3 min read

This is the May 2022 monthly update from the GHC DevX team at IOG.

JavaScript Backend for GHC

For a few months we have been merging GHCJS (Haskell to JavaScript compiler) into GHC. We set our first milestone to be the ability to compile and to run the usual "Hello World" program. It turned out to be much more involved than we initially thought (requiring FFI support, etc.), but we should be getting there soon.

This month we have made the following progress:

  • RTS: we have modified Hadrian and rts.cabal in order to build a valid native rts unit that GHC can use, in particular containing appropriate header files.

  • linker: the JS linker has been hooked up with GHC's driver. We fixed several panics in the linker due to erroneous symbol generation code. These bugs were introduced while porting the code from the old 8.10 pretty-printing infrastructure to the newer one.

  • boot libraries: the JS backend can now build and link all the boot libraries. Note that we are not claiming that they are all usable yet. In particular complete FFI support is lacking, but the JS backend Hadrian build completes and so we can start using the produced JS cross-compiler.

  • levity polymorphism: building ghc-prim uncovered a lurking bug related to levity polymorphism. It wasn't noticed in GHCJS 8.10 because it is also related to the BoxedRep proposal that introduced a constructor application in a commonly used RuntimeRep.

  • sized literals: support for new sized literals have been added to the code generator.

Now that have achieved a build process that actually produces a JS cross compiler, we are confronting and fixing issues in the produced JavaScript code, such as adding, managing, and debugging CPP conditional compilation blocks in JS shim files. You can follow our progress on our development branch here.

External Static Plugins

GHC doesn't support plugins in cross-compilers #14335. Some time ago, we came up with a solution called "external static plugins" !7377. These are plugins that are directly loaded from shared libaries, bypassing the issue with usual plugins.

Our colleague Shea Levy confirmed that the approach works, backported it to GHC 8.10, and has been working on making it work in stage1 cross-compilers for Windows. Kudos for this work, Shea.

As the current user-interface based on environment variables isn't convenient, we have been working on adding new command-line flags to GHC instead. We expect to propose this for integration into GHC when the new interface will be fully implemented.

Blog posts

Inspired by our friends and colleagues at Well-Typed and Tweag, we have been starting to write blog posts for IOG's engineering blog. They will mostly be about stuff we are working on or that we are interested in. Feel free to send us feedback about these posts and to send us topics you would be interested to read about.

Haskell Optimization Handbook

The "Haskell Optimization Handbook" is an accepted proposal of the Haskell Foundation. Jeff has been working behind the scene to make this proposal concrete. More about this in the upcoming months.

· 7 min read

Introduction

At IOG DevX we have been working on integrating various bits of GHCJS into GHC, with the goal of having a fully working JavaScript backend for the 9.6 release. For some parts this has mostly consisted of an update of the code to use the newer GHC API and dependencies. Other bits, like the Template Haskell runner, need more work.

This post gives an overview of the existing approaches for running Template Haskell in GHC based cross compilers and our plan for the JavaScript backend. Hopefully we can revisit this topic once all the work has been done, and see what exactly we ended up with.

The GHCJS Template Haskell Runner

When I first worked on Template Haskell (TH) support for GHCJS, there was no mechanism to combine Template Haskell with cross compilation in GHC.

Normally, Template Haskell is run by loading library code directly into the GHC process and using the bytecode interpreter for the current module. Template Haskell can directly access GHC data structures through the Q monad. Clearly this would not be possible for GHCJS: We only have JavaScript code available for the libraries and the organization of the JavaScript data structures is very different from what GHC uses internally.

So I had to look for an alternative. Running Template Haskell consists of two parts:

  1. loading/executing the TH code
  2. handling compiler queries from the TH code, for example looking up names or types

Running the TH code can be done by first compiling the Haskell to JavaScript and then using the JavaScript eval feature.

Template Haskell code can query the compiler using the Quasi typeclass. I noticed that none of the methods required passing around functions or complicated data structures, so it would be possible to serialize each request and response and send it to another process.

So I went ahead and implemented this approach with a script thrunner.js to load and start the code in a node.js server, a message type with serialization, and a new instance of the Quasi typeclass to handle the communication with the compiler via the messages. This is still what's in use by GHCJS to this day. Every time GHCJS encounters Template Haskell, it starts a thrunner process and the compiler communicates with it over a pipe.

After starting thrunner.js GHCJS sends the Haskell parts of the Template Haskell runnner to the script. This includes the runtime system and the implementation of the Quasi typeclass and communication protocol. After that, the TH session starts. A typical TH session looks as follows:

Compilerthrunner
RunTH THExp <js code> <source location>
LookupName (Just <name-string>)
LookupName' (Just <name>)
Reify <name>
Reify' <name-info>
RunTH' <result>
RunTH THDec <js code> <source location>
AddTopDecls <declarations>
AddTopDecls'
RunTH' <result>
FinishTH True
FinishTH' <memory-consumption>

Each message is followed up by a corresponding reply. For example, a LookupName' response follows a LookupName request and a RunTH message will eventually generate a RunTH' result. The first RunTH message contains the compiled JavaScript for the Template Haskell code, along with its dependencies. Each subsequent RunTH only includes dependencies that have not already been sent.

The thrunner process stays alive during the compilation of at least an entire module, allowing for persistent state (putQ/getQ).

The GHC External Interpreter

If we build a Haskell program with (cost centre) profiling, the layout of our data structures changes to include bookkeeping of cost centre information. This means that we need a special profiling runtime system to run this code.

What can we do if we want to run our profiled build in GHCi or Template Haskell? We cannot load compiled profiling libraries into GHC directly; its runtime system expects non-profiled code. We could use a profiled version of the compiler itself, but this would make all compilation very slow. Or we could somehow separate the profiled code of our own program from the non-profiled code in the compiler.

This was Simon Marlow's motivation for adapting the GHCJS thrunner approach, integrating in GHC and extending it it to support GHCi and bytecode. This functionality can be activated with the -fexternal-interpreter flag and has been available since GHC version 8.0.1. When the external interpreter is activated, GHC starts a separate process, iserv (customizable with the -pgmi flag) which has the role analogous to the thrunner script for GHCJS.

Over time, the iserv code has evolved with GHC and has been extended to include more operations. By now, there are quite a few differences in features:

Featurethrunneriserv
Template Haskell supportyesyes
GHCinoyes
Debuggernoyes
Bytecodenoyes
Object codethrough pipefrom file
Object code linkingcompileriserv process

thrunner is not quite as complete as iserv: It lacks GHCi and the debugger, and there is no bytecode support. But these features are not essential for basic Template Haskell.

Proxies and Bytecodes

We have now seen two systems for running Template Haskell code outside the compiler process: The original GHCJS thrunner and the extended GHC iserv.

Clearly it isn't ideal to have multiple "external interpreter" systems in GHC, therefore we plan to switch from thrunner to iserv for the upcoming JavaScript GHC backend. We don't need the debugger or GHCi support yet, but we do need to adapt to other changes in the infrastructure. So what does this mean in practice?

The biggest change is that we have to rework the linker: thrunner does not contain any linking logic by itself: GHCJS compiles everything to JavaScript and sends compiled code to the thrunner process, ready to be executed. In contrast, iserv has a loader for object and archive files. When dependencies need to be loaded into the interpreter, GHC just gives it the file name.

Another change is using the updated message types. In the thrunner session example above we could see that each message is paired with a response. For example a RunTH' response always follows a RunTH message, with possibly other messages in between. iserv has an interesting approach for the Message datatype: Instead of having pairs of data constructors for each message and its response, iserv has a GADT Message a, where the a type parameter indicates the expected response payload for each data constructor.

During development of the thrunner program it turned out to be very useful to save and replay Template Haskell sessions for debugging purposes. We'd like to do this again, but now saving the message in a readable/writable format. Since we're dealing with JavaScript, JSON appears to be the obvious choice.

Our plan is to have an iserv implementation that consists of a JavaScript part that runs in node.js and a proxy process to handle communication with GHC. The proxy process converts the messages between GHC's own (binary based) serialization format and JSON. The proxy process is relatively simple, but it does reveal one downside of the new GADT based message types: A proxy is stateful. We must always know which message we have sent to convert the response back from JSON to binary.

It's not yet known whether we will implement a full bytecode interpreter. We expect it to become clear during implementation whether we can get away without one early on.

Conclusion

We have seen how Template Haskell and GHCi code can be run outside the GHC process for profiling or cross compiling, with both the thrunner approach in GHCJS and the newer iserv in GHC.

We at IOG DevX are working on switching to the iserv infrastructure for the upcoming GHC JavaScript backend, which involves a substantial rewrite, mainly because of differences in linking. This is a work in progress, and we intend to revisit this topic in another blog post once the final design has been implemented.

· 2 min read

Welcome to the (rather late) April 2022 monthly update from the GHC DevX team at IOG. Since the last update we've continued work on the upcoming JavaScript backend for GHC. Unfortunately, we have nothing to show quite yet but that doesn't mean nothing has happened! On the contrary, we've made great progress and are close to that crucial first milestone hello world. Besides our work on the JavaScript backend, we were pleased to finally push through the Modularizing GHC paper that Sylvain has been working on for 2+ years! It causes quite the splash on the Haskell discourse and reddit, we recommend reading it if you haven't already (links below). Alright, enough introduction let's get into the update.

JavaScript Backend

We have made the following progresses in the implementation of a JavaScript backend for GHC (adapted from GHCJS):

  • linker: ported GHCJS's linker code into GHC. A lot of code was duplicated from GHC and slightly modified for GHCJS's needs, making the process far from trivial.

  • testsuite: fixed Hadrian to run GHC's testsuite with cross-compilers !7850. There are remaining issues though (see #21292).

  • build system: fixes for GHC's configure script were ported (e.g. support for the "ghcjs" target in config.sub). GHCJS's custom build script was integrated into configure.ac. We can now configure the build with: ./configure --target=js-unknown-ghcjs

  • TH: we have conducted some experiments to find the best way to bridge GHCJS's TH runner and GHC's external interpreter. This will be described in details in a future blog post.

  • FFI: basic support for JavaScript FFI has been ported from GHCJS to GHC. We haven't ported the JavaScript parser, so we have dropped the fancy import syntax (e.g. "$1.xyz"). It should be enough to build boot libraries and we will add JS parsing support later.

At this stage, we are working on building boot libraries and on supporting linking with the JS RTS.

Development happens in the following branch: https://gitlab.haskell.org/ghc/ghc/-/tree/wip/js-staging

Modularity paper

Sylvain, Jeffrey, and John Ericson (from Obsidian Systems) wrote a paper about "modularizing GHC" using domain-driven design.

We've got a lot of great feedback about it (expect a first revision soon). We also got a GHC contribution directly inspired by the paper (see !8160) which was very welcome!

· 18 min read

Table of Contents

Haskell is a great language camouflaged by lackluster tooling. This situation has led to well-known problems (who could forget Cabal hell?). A less discussed problem is what I will call the Black-box syndrome: It is hard to know exactly what the memory representation and runtime performance of my Haskell programs are1. Now black-box syndrome is not only a problem, it is also one of the nice features in the language since like all good abstractions it elides things Id rather not care about, at least most of the time. In other words, I am happy I dont have to do manual memory manipulation!

However, when I have my optimization hat on, I run face first into black-box syndrome. The crux of the problem is a tension between the need for observation during performance engineering and optimization, and the need to ship fast code. During development we want to be able to open up a system, see exactly how it is working, make tweaks, package it back up and test again. I want to be able to answer questions like Why is my executable this size?, Which code is a hot loop?, or When does my code do direct, known or unknown function calls?.

In order to answer these questions we need the ability to observe every part of that system as the machine experiences it, without this ability we have no way to make progress other than test, change some code, compile and test again in an ad-hoc manner. And therein lies the problem, most Haskell tooling is insufficient to provide the observability that we would like, instead the tooling often expects and requires us to make source code changes to our program or even recompile all of our libraries and code for a profiling way. This leads to the idea and the expectation in the Haskell community that Haskell programs are hard to optimize because the barrier to entry for optimization has artificially increased.

Csaba Hruska has recently been making headway in this area with his work on the GRIN compiler and an external STG interpreter. His STG interpreter (and patched ghc) exactly solve these problems and he has demonstrated dumping the entire call graph of large Haskell projects, filter to hot loops and finding unknown function calls in these graphs. If you havent seen his demo be sure to watch it, it is well worth your time.

This post is the first in a new blog series. In this blog series were going to kick the tires on the external STG interpreter see what it can do, and what we can uncover in some popular libraries by using it. In particular, Im interested in running it on projects Ive previously optimizedsuch as ghc itself, containers, unordered-containersusing the standard methods: ticky-ticky profiling, prof, flamegraphs, heap profiling, ghc-debug, cachegrind etc. This post, however, will be focused on setting up the patched ghc and interpreter on a NixOS system. My goals are threefold:

  1. Give an overview of the project and project layout to lower barrier to entry for the system.
  2. Give step by step instructions on setting up the interpreter on a nix-based system and provide a forked github repo for nix users. This should allow nix users to just git clone foo and nix-build (spoiler: it wont be that easy but still not hard.)
  3. Popularize Csabas project! It is a refreshing take on Haskell optimization and compilation.

Making sense of the project

The external STG interpreter is part of the GRIN compiler project. We are not doing anything with the GRIN compiler (yet!) and so we are only interested in The GHC whole compiler project. The whole-compiler-project has several sub-projects that well be building and using directly:

  • external-stg: This subproject provides utilites well be using, in particular mkfullpak
  • external-stg-interpreter: This is the actual STG interpreter. The good news is that this is independent of the rest of the project and can be built just like a regular Haskell executable
  • ghc-wpc: This is a fork of ghc-8.10.x (Im not sure exactly which version it forks to be honest) which we must build in order to use the external STG interpreter. Ghc-wpc serves as a frontend for the external-stg-interpreter.

Building a working external STG interpreter

The external STG interpreter can be built like any regular haskell executable. But in order to use the interpreter we have to build ghc-wpc. ghc-wpc is necessary because it serves as a frontend for the STG interpreter. It compiles a Haskell program like normal and then dumps an enriched STG IR to file. This file is then run through a utility gen-exe (gen-exe is an executable built in the external-stg-compiler sub-project) which picks up the compilation pipeline from the STG IR and creates an executable like we would expect from a normal compilation pipeline.

The major difference between this process and the usual compiler pipeline is that ghc-wpc leaves enough compiler information on disk for the rest of the tooling to consume, namely, in files with a *.o_stgbin (this is STG IR generated at compile time), and *.o_stgapp (project linker and dependency information) extension. Thus, once we build this custom ghc version we can use it to build the source code we wish to analyze and begin our optimization work.

For the rest of this tutorial Ill be referencing my fork of the ghc-whole-compiler-project that includes everything you need if you want to follow along, including .nix files for creating a nix-shell which will prepare a suitable environment to run the entire toolchain.

ghc.nix

The usual way to build ghc using a nix based system is with the ghc.nix project. Ghc.nix provides a default.nix with a suitable environment to run hadrian and build ghc. For ghc-wpc well need some special packages, and we need our boot compiler to be exactly ghc-8.3.3. The custom ghc.nix file is included in my fork, Ive taken the liberty to pin the nixpkgs to the right version for ghc-8.3.3. So lets begin:

Clone the forked repo:

$ git clone https://github.com/doyougnu/ghc-whole-program-compiler-project.git

$ cd ghc-whole-program-compiler-project

$ tree -L 1
.
├── dist-newstyle
├── external-stg
├── external-stg-compiler
├── external-stg-interpreter
├── ghc.nix.wpc
├── ghc-wpc
├── lambda
├── mod-pak
├── README.md
├── shell.nix
├── stack.yaml
└── stack.yaml.lock

Youll find the patched ghc.nix included (ghc.nix.wpc) and a shell.nix for a nix-shell. The shell.nix file simply references ghc.nix.wpc/default.nix with the appropriate options:

$ cat shell.nix
import (./ghc.nix.wpc/default.nix) {
useClang = true;
withHadrianDeps = true;
withIde = false;
withLlvm = true;
}

Building ghc-wpc

Now we can enter a nix-shell and build ghc-wpc:

$ pwd
/home/doyougnu/programming/haskell/ghc-whole-program-compiler-project

$ nix-shell shell.nix # or just nix-shell
trace: checking if /home/doyougnu/programming/haskell/ghc-whole-program-compiler-project/hadrian/hadrian.cabal is present: no
Recommended ./configure arguments (found in $CONFIGURE_ARGS:
or use the configure_ghc command):

--with-gmp-includes=/nix/store/sznfxigwvrvn6ar3nz3f0652zsld9xqj-gmp-6.2.0-dev/include
--with-gmp-libraries=/nix/store/447im4mh8gmw85dkrvz3facg1jsbn6c7-gmp-6.2.0/lib
--with-curses-includes=/nix/store/84g84bg47xxg01ba3nv0h418v5v3969n-ncurses-6.1-20190112-dev/include
--with-curses-libraries=/nix/store/xhhkr936b9q5sz88jp4l29wljbbcg39k-ncurses-6.1-20190112/lib
--with-libnuma-includes=/nix/store/bfrcskjspk9a179xqqf1q9xqafq5s8d2-numactl-2.0.13/include
--with-libnuma-libraries=/nix/store/bfrcskjspk9a179xqqf1q9xqafq5s8d2-numactl-2.0.13/lib
--with-libdw-includes=/nix/store/sv6f05ngaarba50ybr6fdfc7cciv6nbv-elfutils-0.176/include
--with-libdw-libraries=/nix/store/sv6f05ngaarba50ybr6fdfc7cciv6nbv-elfutils-0.176/lib
--enable-dwarf-unwind

[nix-shell:~/programming/haskell/ghc-whole-program-compiler-project]$

Now we need to cd into ghc-wpc and tweak the hadrian build.

MAJOR CONSTRAINT: You must build ghc-wpc with hadrian/build-stack, if you build in any other way youll run into shared object errors, see this ticket for details.

So in order to build ghc-wpc with stack well have to tweak the stack.yaml file. You must do this since it is not included in the fork:

Quick side note: To make the formatting nicer I truncate nix-shell:~/foo/bar/baz/ghc-whole-program-compiler-project to just ..., so nix-shell:.../ghc-wpc is equivalent to ~/path/to/ghc-whole-compiler-project/ghc-wpc.

[nix-shell:...]$ cd ghc-wpc/hadrian/

[nix-shell:.../ghc-wpc/hadrian]$ cat stack.yaml
resolver: lts-15.5

packages:
- '.'
- 'GHC-Cabal'

system-ghc: true

nix:
enable: true
shell-file: ../../shell.nix

The changes are: (1) tell stack we are using nix, and (2) reference the shell.nix file which points to ghc.wpc.nix at the root of the project, i.e., ghc-whole-program-compiler-project/shell.nix.

Now we should be able to begin our build, return to the root of ghc-wpc and run the following:

[nix-shell:.../ghc-wpc/hadrian]$ cd ..

[nix-shell:.../ghc-wpc]$ ./boot && ./configure

[nix-shell:.../ghc-wpc]$ hadrian/build-stack -j

and go get some coffee since this will take some time. Once it finishes you should have the ghc-wpc binary in _build/stage1/bin

[nix-shell:.../ghc-wpc]$ ls -l _build/stage1/bin/
total 8592
-rwxr-xr-x 1 doyougnu users 1843752 Apr 29 23:01 ghc
-rw-r--r-- 1 doyougnu users 11082 Apr 29 23:01 ghc.dyn_o_ghc_stgapp
-rwxr-xr-x 1 doyougnu users 660128 Apr 29 22:50 ghc-pkg
-rw-r--r-- 1 doyougnu users 9977 Apr 29 22:50 ghc-pkg.dyn_o_ghc_stgapp
-rwxr-xr-x 1 doyougnu users 4624680 Apr 29 23:01 haddock
-rw-r--r-- 1 doyougnu users 16883 Apr 29 23:01 haddock.dyn_o_ghc_stgapp
-rwxr-xr-x 1 doyougnu users 49344 Apr 29 22:25 hp2ps
-rw-r--r-- 1 doyougnu users 2504 Apr 29 22:25 hp2ps.dyn_o_ghc_stgapp
-rwxr-xr-x 1 doyougnu users 716440 Apr 29 22:35 hpc
-rw-r--r-- 1 doyougnu users 9959 Apr 29 22:35 hpc.dyn_o_ghc_stgapp
-rwxr-xr-x 1 doyougnu users 738544 Apr 29 22:35 hsc2hs
-rw-r--r-- 1 doyougnu users 10264 Apr 29 22:35 hsc2hs.dyn_o_ghc_stgapp
-rwxr-xr-x 1 doyougnu users 58384 Apr 29 22:34 runghc
-rw-r--r-- 1 doyougnu users 8864 Apr 29 22:34 runghc.dyn_o_ghc_stgapp

Notice that this build dumped *.<way>_o_ghc_stgapp files!

Building the stg tooling

Now that we have a working ghc-wpc we need to build the rest of the project by pointing stack to the ghc-wpc binary in ghc-wpc/_build/stage1/bin. That is, we must change the ghc-whole-program-compiler-project/stack.yaml file:

[nix-shell:~/programming/haskell/ghc-whole-program-compiler-project]$ cat stack.yaml
resolver: lts-16.13

allow-newer: true

packages:
- 'external-stg-compiler'
- 'external-stg'

ghc-options:
"$everything": -fno-stgbin -fno-stgapp -optcxx-std=c++17

extra-deps:
- async-pool-0.9.1@sha256:4015140f896c3f1652b06a679b0ade2717d05557970c283ea2c372a71be2a6a1,1605
- souffle-haskell-1.1.0
- zip-1.7.0


# use custom ext-stg whole program compiler GHC
compiler: ghc-8.11.0
skip-ghc-check: true

nix:
enable: false


# use local GHC (for development)
system-ghc: true
extra-path:
- /home/doyougnu/programming/haskell/ghc-whole-program-compiler-project/ghc-wpc/_build/stage1/bin

# DEBUG INFO
#dump-logs: all
#build:
# keep-tmp-files: true
# cabal-verbose: true

The changes are: (1) set compiler: ghc-8.11.0 (the ghc-wpc fork), (2) set skip-ghc-check: true so that stack doesnt complain about the ghc version, (3) set nix.enable: false, confusingly if you leave this as true then stack will try to use nixpkgs to get a ghc binary, but we want it to use our local binary so we disable this even though well still be in our original nix-shell (4) set system-path: true to tell stack we will be using a ghc we have on our system, and finally (5) set extra-path: <path-to-ghc-wpc-binary>.

Now we can run stack and install the stg tooling:

[nix-shell:...]$ stack --stack-root `pwd`/.stack-root install
Trouble loading CompilerPaths cache: UnliftIO.Exception.throwString called with:

Compiler file metadata mismatch, ignoring cache
Called from:
throwString (src/Stack/Storage/User.hs:277:8 in stack-2.7.5-9Yv1tjrmAU3JiZWCo86ldN:Stack.Storage.User)

WARNING: Ignoring tagged's bounds on template-haskell (>=2.8 && <2.17); using template-haskell-2.17.0.0.
Reason: allow-newer enabled.
WARNING: Ignoring aeson's bounds on template-haskell (>=2.9.0.0 && <2.17); using template-haskell-2.17.0.0.
Reason: allow-newer enabled.
WARNING: Ignoring th-abstraction's bounds on template-haskell (>=2.5 && <2.17); using template-haskell-2.17.0.0.
Reason: allow-newer enabled.
WARNING: Ignoring unliftio-core's bounds on base (>=4.5 && <4.14); using base-4.14.0.0.
Reason: allow-newer enabled.
WARNING: Ignoring souffle-haskell's bounds on megaparsec (>=7.0.5 && <8); using megaparsec-8.0.0.
stack --stack-root `pwd`/.stack-root install
... # bunch of output
...
...
Copied executables to /home/doyougnu/.local/bin:
- dce-fullpak
- ext-stg
- fullpak
- gen-exe
- gen-exe2
- gen-obj
- gen-obj2
- mkfullpak
- show-ghc-stg

Warning: Installation path /home/doyougnu/.local/bin not found on the PATH environment variable.

You can add ~/.local/bin to your PATH if you want, Ill just be directly referencing these binaries as we go.

Building the external-stg-interpreter

We are almost all done, all that is left is to build the external-stg-interpreter and run a small script that links everything together into a shared object for the interpreter. So:

[nix-shell:...]$ cd external-stg-interpreter/

[nix-shell:.../external-stg-interpreter]$ stack install
... # bunch of output
...
Copied executables to /home/doyougnu/.local/bin:
- ext-stg
- ext-stg-interpreter
- fullpak
- mkfullpak

Warning: Installation path /home/doyougnu/.local/bin not found on the PATH environment variable.

Now we have our ext-stg-interpreter built! There are a few caveats I want to point out here. Ive modified ghc-whole-program-compiler-project/external-stg-interpreter/stack.yaml to load the right packages and use nix:

[nix-shell:.../external-stg-interpreter]$ cat stack.yaml
resolver: lts-16.13

packages:
- '.'
- 'external-stg'

extra-deps:
- souffle-haskell-2.1.0
- primitive-0.7.1.0
- zip-1.7.0

nix:
enable: true
packages: [ zlib, libffi, pkg-config, bzip2 ]

Notice the nix: block. We could have just as easily built this using nix directly or using our shell.nix file.

Linking the external-stg-interpreter

The only task left is to link into a shared object library called libHSbase-4.14.0.0.cbits.so. To do that we need to use the script called, c, in ghc-whole-program-compiler-project/external-stg-interpreter/data. This script is a bit of a hack, it generates the shared object file so that we can link the symbols requested by the C FFI in base, but it populates those functions with our replacements, which do absolutely nothing. For example, we supply a fake garbage collect:

// in .../external-stg-interpreter/data/cbits.so-script/c-src/fake_rts.c
...
void performGC(void) {
}

void performMajorGC(void) {
}
...

This works because we won't be using the runtime system at all, we'll be using the external STG interpreter instead, however we still need to provide these symbols in order to link. **MAJOR NOTE: this file must be next to any *.fullpak file youll be running the interpreter on** or else youll get an undefined symbol error during linking, for example:

[nix-shell:.../external-stg-interpreter/data]$ ls
cbits.so-script ghc-rts-base.fullpak minigame-strict.fullpak

### notice no .so file
[nix-shell:.../external-stg-interpreter/data]$ ~/.local/bin/ext-stg-interpreter ghc-rts-base.fullpak
ext-stg-interpreter: user error (dlopen: ./libHSbase-4.14.0.0.cbits.so: cannot open shared object file: No such file or directory)

## we error'd out because it was missing, also
## if you get this error then you have an old cbits.so file and need to rerun the c script
[nix-shell:.../external-stg-interpreter/data]$ ~/.local/bin/ext-stg-interpreter ghc-rts-base.fullpak
ext-stg-interpreter: user error (dlopen: ./libHSbase-4.14.0.0.cbits.so: undefined symbol: getProcessElapsedTime)

To link the interpreter we need to run c in the data/cbits.so-script sub-folder:

[nix-shell:.../external-stg-interpreter]$ cd data/cbits.so-script/

[nix-shell:.../external-stg-interpreter/data/cbits.so-script]$ ls
ar c cbits-rts.dyn_o c-src libHSbase-4.14.0.0.cbits.so stub-base.dyn_o

[nix-shell:.../external-stg-interpreter/data/cbits.so-script]$ ./c
++ ls ar/libHSbase-4.14.0.0-ghc8.11.0.20210220.dyn_o_cbits.a ar/libHSbindings-GLFW-3.3.2.0-Jg9TvsfYUZwD0ViIP0H2Tz-ghc8.11.0.20210306.dyn_o_cbits.a ar/libHSbytestring-0.10.9.0-ghc8.11.0.20210306.dyn_o_cbits.a ar/libHScriterion-measurement-0.1.2.0-73BCI2Fnk7qE8QjjTa1xNa-ghc8.11.0.20210324.dyn_o_cbits.a ar/libHSghc-8.11.0.20210306-ghc8.11.0.20210306.dyn_o_cbits.a ar/libHSGLUT-2.7.0.15-1pzTWDEZBcYHcS36qZ2lpp-ghc8.11.0.20201112.dyn_o_cbits.a ar/libHSGLUT-2.7.0.15-1pzTWDEZBcYHcS36qZ2lpp-ghc8.11.0.20210324.dyn_o_stubs.a ar/libHShashable-1.3.0.0-Kn7aNSFvzgo2qY16wYzuCX-ghc8.11.0.20210306.dyn_o_cbits.a ar/libHSinteger-gmp-1.0.3.0-ghc8.11.0.20210220.dyn_o_cbits.a ar/libHSlambdacube-quake3-engine-0.1.0.0-7CKLP3Rqgq0PR81lhlwlR-ghc8.11.0.20210306.dyn_o_cbits.a ar/libHSmersenne-random-pure64-0.2.2.0-ExYg8DmthtrLG9JevQbt2m-ghc8.11.0.20210306.dyn_o_cbits.a ar/libHSOpenGLRaw-3.3.4.0-5vXBlmbOM3AIT7GRYfpE3o-ghc8.11.0.20201112.dyn_o_cbits.a ar/libHSprimitive-0.7.0.1-2k3g9qX0zz16vEv34R307m-ghc8.11.0.20210306.dyn_o_cbits.a ar/libHSprocess-1.6.8.2-ghc8.11.0.20210220.dyn_o_cbits.a ar/libHStext-1.2.4.0-ghc8.11.0.20210220.dyn_o_cbits.a ar/libHSunix-2.7.2.2-ghc8.11.0.20210220.dyn_o_cbits.a ar/libHSunix-2.7.2.2-ghc8.11.0.20210220.dyn_o_stubs.a ar/libHSzlib-0.6.2.1-1I6DmfbLEyTBgDZI7SbZfW-ghc8.11.0.20210306.dyn_o_stubs.a
++ ls stub-base.dyn_o/Blank_stub.dyn_o stub-base.dyn_o/ClockGetTime_stub.dyn_o stub-base.dyn_o/Internals_stub.dyn_o stub-base.dyn_o/RUsage_stub.dyn_o
++ ls cbits-rts.dyn_o/StgPrimFloat.dyn_o cbits-rts.dyn_o/TTY.dyn_o
++ ls c-src/fake_rts.c c-src/hack.c c-src/hschooks.c
+ gcc -o libHSbase-4.14.0.0.cbits.so -shared -Wl,--whole-archive ar/libHSbase-4.14.0.0-ghc8.11.0.20210220.dyn_o_cbits.a ar/libHSbindings-GLFW-3.3.2.0-Jg9TvsfYUZwD0ViIP0H2Tz-ghc8.11.0.20210306.dyn_o_cbits.a ar/libHSbytestring-0.10.9.0-ghc8.11.0.20210306.dyn_o_cbits.a ar/libHScriterion-measurement-0.1.2.0-73BCI2Fnk7qE8QjjTa1xNa-ghc8.11.0.20210324.dyn_o_cbits.a ar/libHSghc-8.11.0.20210306-ghc8.11.0.20210306.dyn_o_cbits.a ar/libHSGLUT-2.7.0.15-1pzTWDEZBcYHcS36qZ2lpp-ghc8.11.0.20201112.dyn_o_cbits.a ar/libHSGLUT-2.7.0.15-1pzTWDEZBcYHcS36qZ2lpp-ghc8.11.0.20210324.dyn_o_stubs.a ar/libHShashable-1.3.0.0-Kn7aNSFvzgo2qY16wYzuCX-ghc8.11.0.20210306.dyn_o_cbits.a ar/libHSinteger-gmp-1.0.3.0-ghc8.11.0.20210220.dyn_o_cbits.a ar/libHSlambdacube-quake3-engine-0.1.0.0-7CKLP3Rqgq0PR81lhlwlR-ghc8.11.0.20210306.dyn_o_cbits.a ar/libHSmersenne-random-pure64-0.2.2.0-ExYg8DmthtrLG9JevQbt2m-ghc8.11.0.20210306.dyn_o_cbits.a ar/libHSOpenGLRaw-3.3.4.0-5vXBlmbOM3AIT7GRYfpE3o-ghc8.11.0.20201112.dyn_o_cbits.a ar/libHSprimitive-0.7.0.1-2k3g9qX0zz16vEv34R307m-ghc8.11.0.20210306.dyn_o_cbits.a ar/libHSprocess-1.6.8.2-ghc8.11.0.20210220.dyn_o_cbits.a ar/libHStext-1.2.4.0-ghc8.11.0.20210220.dyn_o_cbits.a ar/libHSunix-2.7.2.2-ghc8.11.0.20210220.dyn_o_cbits.a ar/libHSunix-2.7.2.2-ghc8.11.0.20210220.dyn_o_stubs.a ar/libHSzlib-0.6.2.1-1I6DmfbLEyTBgDZI7SbZfW-ghc8.11.0.20210306.dyn_o_stubs.a -Wl,--no-whole-archive stub-base.dyn_o/Blank_stub.dyn_o stub-base.dyn_o/ClockGetTime_stub.dyn_o stub-base.dyn_o/Internals_stub.dyn_o stub-base.dyn_o/RUsage_stub.dyn_o cbits-rts.dyn_o/StgPrimFloat.dyn_o cbits-rts.dyn_o/TTY.dyn_o -fPIC c-src/fake_rts.c c-src/hack.c c-src/hschooks.c -lm -lgmp -ltinfo -lGL -lX11 -lXi -lXrandr -lXxf86vm -lXcursor -lXinerama -lpthread

This will produce libHSbase-4.14.0.0.cbits.so in the immediate directory:

[nix-shell:.../external-stg-interpreter/data/cbits.so-script]$ ls -l
total 984
drwxr-xr-x 2 doyougnu users 4096 Apr 27 14:10 ar
-rwxr-xr-x 1 doyougnu users 300 Apr 27 14:10 c
drwxr-xr-x 2 doyougnu users 4096 Apr 27 14:10 cbits-rts.dyn_o
drwxr-xr-x 2 doyougnu users 4096 Apr 27 14:10 c-src
-rwxr-xr-x 1 doyougnu users 986008 Apr 30 11:50 libHSbase-4.14.0.0.cbits.so ## <----- new
drwxr-xr-x 2 doyougnu users 4096 Apr 27 14:10 stub-base.dyn_o

Now we can test our interpreter by running it on the *.fullpak files in external-stg-interpreter/data:

[nix-shell:.../external-stg-interpreter/data/cbits.so-script]$ cd ..

[nix-shell:.../external-stg-interpreter/data]$ ls
cbits.so-script ghc-rts-base-call-graph-summary ghc-rts-base-call-graph.tsv ghc-rts-base.fullpak libHSbase-4.14.0.0.cbits.so minigame-strict.fullpak

## remove the old .so file
[nix-shell:.../external-stg-interpreter/data]$ rm libHSbase-4.14.0.0.cbits.so

## soft-link to the one we just built
[nix-shell:.../external-stg-interpreter/data]$ ln -s cbits.so-script/libHSbase-4.14.0.0.cbits.so libHSbase-4.14.0.0.cbits.so

[nix-shell:.../external-stg-interpreter/data]$ ls -l
total 79220
drwxr-xr-x 6 doyougnu users 4096 Apr 30 11:50 cbits.so-script
-rw-r--r-- 1 doyougnu users 48 Apr 30 11:47 ghc-rts-base-call-graph-summary
-rw-r--r-- 1 doyougnu users 28238 Apr 30 11:47 ghc-rts-base-call-graph.tsv
-rw-r--r-- 1 doyougnu users 22450708 Apr 27 14:10 ghc-rts-base.fullpak
lrwxrwxrwx 1 doyougnu users 43 Apr 30 11:55 libHSbase-4.14.0.0.cbits.so -> cbits.so-script/libHSbase-4.14.0.0.cbits.so ### <---- new
-rw-r--r-- 1 doyougnu users 58630129 Apr 27 14:10 minigame-strict.fullpak

[nix-shell:.../external-stg-interpreter/data]$ ~/.local/bin/ext-stg-interpreter ghc-rts-base.fullpak
hello
hello
ssHeapStartAddress: 53522
ssTotalLNECount: 69
ssClosureCallCounter: 360
executed closure id count: 114
call graph size: 150

[nix-shell:.../external-stg-interpreter/data]$ ls -l
total 79220
drwxr-xr-x 6 doyougnu users 4096 Apr 30 11:50 cbits.so-script
-rw-r--r-- 1 doyougnu users 48 Apr 30 11:56 ghc-rts-base-call-graph-summary ### <---- interpreter output
-rw-r--r-- 1 doyougnu users 28238 Apr 30 11:56 ghc-rts-base-call-graph.tsv ### <---- interpreter output
-rw-r--r-- 1 doyougnu users 22450708 Apr 27 14:10 ghc-rts-base.fullpak
lrwxrwxrwx 1 doyougnu users 43 Apr 30 11:55 libHSbase-4.14.0.0.cbits.so -> cbits.so-script/libHSbase-4.14.0.0.cbits.so
-rw-r--r-- 1 doyougnu users 58630129 Apr 27 14:10 minigame-strict.fullpak

And it works, we have two new files, <foo>-call-graph-summary and <foo>-call-graph.tsv which we can analyze to inspect the behavior of our program (more on this later).

The whole setup process on a demo

That was a rather involved example, to make clear the dependencies and steps required to run this on your own code the rest of this tutorial will run the interpreter on two of Csabas demos from his skillshare talk. First lets grab the code:

$ pwd
/home/doyougnu/programming/haskell

$ git clone https://github.com/grin-compiler/ext-stg-interpreter-presentation-demos.git

$ ls
ext-stg-interpreter-presentation-demos ghc-whole-program-compiler-project ..

Now well run the first demo which is a simply fold over a list:

$ nix-shell ghc-whole-program-compiler-project/shell.nix
trace: checking if /home/doyougnu/programming/haskell/hadrian/hadrian.cabal is present: no
Recommended ./configure arguments (found in $CONFIGURE_ARGS:
or use the configure_ghc command):

--with-gmp-includes=/nix/store/sznfxigwvrvn6ar3nz3f0652zsld9xqj-gmp-6.2.0-dev/include
--with-gmp-libraries=/nix/store/447im4mh8gmw85dkrvz3facg1jsbn6c7-gmp-6.2.0/lib
--with-curses-includes=/nix/store/84g84bg47xxg01ba3nv0h418v5v3969n-ncurses-6.1-20190112-dev/include
--with-curses-libraries=/nix/store/xhhkr936b9q5sz88jp4l29wljbbcg39k-ncurses-6.1-20190112/lib
--with-libnuma-includes=/nix/store/bfrcskjspk9a179xqqf1q9xqafq5s8d2-numactl-2.0.13/include
--with-libnuma-libraries=/nix/store/bfrcskjspk9a179xqqf1q9xqafq5s8d2-numactl-2.0.13/lib
--with-libdw-includes=/nix/store/sv6f05ngaarba50ybr6fdfc7cciv6nbv-elfutils-0.176/include
--with-libdw-libraries=/nix/store/sv6f05ngaarba50ybr6fdfc7cciv6nbv-elfutils-0.176/lib
--enable-dwarf-unwind

[nix-shell:~/programming/haskell]$ cd ext-stg-interpreter-presentation-demos/demo-01-tsumupto/

[nix-shell:~/programming/haskell/ext-stg-interpreter-presentation-demos/demo-01-tsumupto]$ ../../ghc-whole-program-compiler-project/ghc-wpc/_build/stage1/bin/ghc -O2 tsumupto.hs
[1 of 1] Compiling Main ( tsumupto.hs, tsumupto.o )
Linking tsumupto ...
$ cd ext-stg-interpreter-presentation-demos/demo-01-tsumupto

$ ls
tsumupto tsumupto.hi tsumupto.hs tsumupto.o tsumupto.o_ghc_stgapp tsumupto.o_modpak

Note, that we have two new files: *.o_ghc_stgapp and .o_modpak as a result of building with ghc-wpc. If you try to run this from outside the nix-shell youll get an error about missing mkmodpak:

$ ../../ghc-whole-program-compiler-project/ghc-wpc/_build/stage1/bin/ghc -O2 tsumupto.hs
[1 of 1] Compiling Main ( tsumupto.hs, tsumupto.o )
ghc: could not execute: mkmodpak

Now that we have those files we can run the interpreter, but first though we need to make a *.fullpak file from the *.o_ghc_stgapp file and create a symbolic link to libHSbase-4.14.0.0.cbits.so:

## make the fullpack file
$ ~/.local/bin/mkfullpak tsumupto.o_ghc_stgapp
all modules: 259
app modules: 113
app dependencies:
... # bunch of output
...
main Main
creating tsumupto.fullpak

## create the link to the shared object file
$ ln -s ../../ghc-whole-program-compiler-project/external-stg-interpreter/data/cbits.so-script/libHSbase-4.14.0.0.cbits.so libHSbase-4.14.0.0.cbits.so

## the final directory should look like this
$ ls
libHSbase-4.14.0.0.cbits.so tsumupto tsumupto.fullpak tsumupto.hi tsumupto.hs tsumupto.o tsumupto.o_ghc_stgapp tsumupto.o_modpak

And now we can run the interpreter:

$ ~/.local/bin/ext-stg-interpreter tsumupto.fullpak
50005000
ssHeapStartAddress: 44082
ssTotalLNECount: 43
ssClosureCallCounter: 30275
executed closure id count: 112
call graph size: 146

The first line is the output of the program and the rest are diagnostics that the interpreter outputs. More importantly we should have a tab-separated csv file and call graph file in our local directory after running the interpreter:

$ ls -l
total 23876
lrwxrwxrwx 1 doyougnu users 114 Apr 30 12:21 libHSbase-4.14.0.0.cbits.so -> ../../ghc-whole-program-compiler-project/external-stg-interpreter/data/cbits.so-script/libHSbase-4.14.0.0.cbits.so
-rwxr-xr-x 1 doyougnu users 9442648 Apr 30 12:12 tsumupto
-rw-r--r-- 1 doyougnu users 53 Apr 30 12:23 tsumupto-call-graph-summary ### <---- interpreter output
-rw-r--r-- 1 doyougnu users 27490 Apr 30 12:23 tsumupto-call-graph.tsv ### <---- interpreter output
-rw------- 1 doyougnu users 14922366 Apr 30 12:19 tsumupto.fullpak
-rw-r--r-- 1 doyougnu users 1769 Apr 30 12:12 tsumupto.hi
-rw-r--r-- 1 doyougnu users 207 Apr 28 22:56 tsumupto.hs
-rw-r--r-- 1 doyougnu users 4488 Apr 30 12:12 tsumupto.o
-rw-r--r-- 1 doyougnu users 8817 Apr 30 12:12 tsumupto.o_ghc_stgapp
-rw------- 1 doyougnu users 9803 Apr 30 12:12 tsumupto.o_modpak

Which can be loaded into gephi for closer inspection of the call graph of our program. Be sure to watch the rest of the demo in Csabas talk for this part! For now well be going over using gephi and these files in our next blog post in this series, stay tuned!

Summary

File Descriptions

  • foo.modpak: A zip file which contains the Core, STG, CMM, source code, and assembly for the module foo
  • foo.fullpak: A zip file which contains the same information as modpack but for every module of the program rather than just module foo.
  • foo.o_ghc_stgapp: a yaml like file that contains:
    • the modules dependencies including package dependencies
    • a bunch of file paths for shared objects of the libraries
    • the flags the module was built with
  • libHSbase-4.14.0.0.cbits.so: shared object file created by ext-stg-interpreter/data/cbits.so-script.c. Required to be in the same directory as ext-stg-interpreter will be invoked.

Step-by-Step guide for running the interpreter on your code

  1. Build your project with ghc-wpc/_build/stage1/bin by directly invoking that ghc (as I did in the demo-01 project) or by pointing stack to it with system-ghc and extra-path in stack.yaml, or by passing -w <path-to-ghc-wpc-binary with cabal.
  2. Generate the foo.fullpak file with mkfullpak foo.o_ghc_stgapp
  3. Soft-link to libHSbase-4.14.0.0.cbits.so in the directory you will run the interpreter in. This file must be present when you run the interpreter!
  4. Now run the interpreter on project.fullpak
  5. Analyze foo-call-graph-summary and foo-call-graph.tsv with whatever tools make sense to you

Footnotes


  1. This isnt completely true, there is the RuntimeRep type controls exactly this and the levity polymorphism work by Richard Eisenberg. See this video for examples on using these features. We do plan to include a more thorough and real world example on using levity polymorphism for better performance in the haskell optimization handbook.

· 5 min read

In this post I discuss the inlining of Integer and Natural operations in Haskell. It’s a promising performance work I’ve been conducting six months ago, which was blocked by an independent issue, but that I will likely resume soon as the issue has been fixed in the meantime.


To follow this post, you must know that Natural numbers are represented as follows in ghc-bignum:

-- | Natural number
--
-- Invariant: numbers <= WORD_MAXBOUND use the `NS` constructor
data Natural
= NS !Word#
| NB !BigNat#

Small naturals are represented with a Word# and large ones with a BigNat# (a ByteArray#).

Now consider the following simple example using Natural:

-- | Add 2 to a Word. Use Natural to avoid Word overflow
foo :: Word -> Natural
foo x = fromIntegral x + 2

There are only small naturals involved: fromIntegral x is small because x is a Word, and 2 is small. We could hope that GHC would use Word# primops to implement this and would allocate a Natural heap object for the result only. However it’s not what happens currently, even in GHC HEAD. In the following STG dump, we can see that a Natural heap object is allocated for x before calling naturalAdd (let bindings in STG reflect heap allocations):

foo1 = NS! [2##];

foo =
\r [x_sXn]
case x_sXn of {
W# x#_sXp ->
let { sat_sXq = NS! [x#_sXp]; } in naturalAdd sat_sXq foo1;
};

Let’s look at naturalAdd:

-- | Add two naturals
naturalAdd :: Natural -> Natural -> Natural
{-# NOINLINE naturalAdd #-}
naturalAdd (NS x) (NB y) = NB (bigNatAddWord# y x)
naturalAdd (NB x) (NS y) = NB (bigNatAddWord# x y)
naturalAdd (NB x) (NB y) = NB (bigNatAdd x y)
naturalAdd (NS x) (NS y) =
case addWordC# x y of
(# l,0# #) -> NS l
(# l,c #) -> NB (bigNatFromWord2# (int2Word# c) l)

We are clearly in the last case where both arguments are small. It seems beneficial to allow this function to be inlined. If we did we would get:

foo =
\r [x_s158]
case x_s158 of {
W# x#_s15a ->
case addWordC# [x#_s15a 2##] of {
(#,#) l_s15c ds_s15d ->
case ds_s15d<TagProper> of ds1_s15e {
__DEFAULT ->
case int2Word# [ds1_s15e] of sat_s15f {
__DEFAULT ->
case bigNatFromWord2# sat_s15f l_s15c of ds2_s15g {
__DEFAULT -> NB [ds2_s15g];
};
};
0# -> NS [l_s15c];
};
};
};

which produces much better assembly code, especially if there is no carry:

    addq $2,%rax       ; add 2 to a machine word
setc %bl ; test the carry.
movzbl %bl,%ebx ; it could be done
testq %rbx,%rbx ; more efficiently
jne _blk_c17c ; with "jc"
_blk_c17i:
movq $NS_con_info,-8(%r12) ; alloc NS datacon value
movq %rax,(%r12) ; with the addition result as payload.
leaq -7(%r12),%rbx ; make it the first argument
addq $8,%rbp ; and then
jmp *(%rbp) ; call continuation
...

So why aren’t we always inlining naturalAdd? We even explicitly disallow it with a NOINLINE pragma. The reason is that naturalAdd and friends are involved in constant-folding rules.

For example, consider:

bar :: Natural -> Natural
bar x = x + 2

baz = bar 0x12345678913245678912345679123456798

Currently we get the following Core:

bar1 = NS 2##

bar = \ x_aHU -> naturalAdd x_aHU bar1

baz = NB 99114423092485377935703335253042771879834

You can see that baz is a constant thanks to constant-folding.

However if we let naturalAdd inline we get:

baz
= case bigNatAddWord# 99114423092485377935703335253042771879832 2##
of ds_d11H
{ __DEFAULT ->
NB ds_d11H
}

baz is no longer a constant.

A solution would be to add constant-folding rules for BigNat# functions, such as bigNatAddWord#. This is exactly what we have started doing in #20361. Our new plan is:

  • Make BigNat# operation NOINLINE and add constant-folding rules for them
  • Make Integer/Natural operations INLINEABLE (expose their unfolding)
  • Hence rely on constant-folding for Word#/Int#/BigNat# to provide constant folding for Integer and Natural

The good consequences of this plan are:

  • Less allocations when bignum operations are inlined and some of the arguments are known to be small/big or fully known (constant).
  • Integer and Natural are less magical: you can implement your own similar types and expect the same performance without having to add new rewrite rules

There were some unforeseen difficulties with this plan though:

  1. Some of the rewrite rules we need involve unboxed values such as BigNat# and Word# and the weren’t supported. Luckily, this has been recently fixed (#19313) by removing the “app invariant” (#20554). Thanks Joachim! That’s the reason why we could resume this work now.
  2. Some unfoldings (RHSs) become bigger due to the inlining of bignum operations. Hence they may not themselves be inlined further due to inlining thresholds even if it would be beneficial. A better inlining heuristic would fix this (see #20516). It will likely be the topic of the next post.

· 3 min read

JS Backend

In March the team focused on porting more GHCJS code to GHC head.

  • Most of us are new to GHCJS’s codebase so we are taking some time to better understand it and to better document it as code gets integrated into GHC head.
  • Development process: initially we had planned to integrate features one after the others into GHC head. However it was finally decided that features would be merged into a wip/javascript-backend branch first and then later merged into GHC head. After trying this approach we decided to work directly into another branch: wip/js-staging . Opening merge requests that can’t be tested against a branch that isn’t GHC head didn’t bring any benefit and slowed us too much.
  • Documentation: we wrote a document comparing the different approaches to target JavaScript/WebAssembly https://gitlab.haskell.org/ghc/ghc/-/wikis/javascript
  • RTS: some parts of GHCJS’s RTS are generated from Haskell code, similarly to code generated with the genapply program in the C RTS. This code has been ported to GHC head. As JS linking---especially linking with the RTS---will only be performed by GHC in the short term, we plan to make it generate this code dynamically at link time.
  • Linker: most of GHCJS’s linker code has been adapted to GHC head. Because of the lack of modularity of GHC, a lot of GHC code was duplicated into GHCJS and slightly modified. Now that both codes have diverged we need to spend some time making them converge again, probably by making the Linker code in GHC more modular.
  • Adaptation to GHC head: some work is underway to replace GHCJS’s Objectable type-class with GHC’s Binary type-class which serves the same purpose. Similarly a lot of uses of Text have been replaced with GHC’s ShortText or FastString.
  • Template Haskell: GHCJS has its own TH runner which inspired GHC’s external interpreter (“Iserv”) programs. We have been exploring options to port TH runner code as an Iserv implementation. The Iserv protocol uses GADTs to represent its messages which requires more boilerplate code to convert them into JS because we can’t automatically derive aeson instances for them.
  • Plugins: we have an MR adding support for “external static plugins” to GHC !7377. Currently it only supports configuring plugins via environment variables. We have been working on adding support for command-line flags instead.
  • Testsuite: we have fixed GHC’s build system so that it can run GHC’s testsuite when GHC is built as a cross-compiler (!7850). There is still some work to do (tracked in #21292) to somehow support tests that run compiled programs: with cross-compilers, target programs can’t be directly executed by the host architecture.

Misc

  • Performance book: some time was spent on the infrastructure (CI) and on switching the format of the book to ReStructured Text
  • Modularity: some time was spent discussing GHC’s design and refactoring (c.f. !7442 and #20927).

· 2 min read

JS backend

This month we worked on adapting code from GHCJS to merge into GHC head. We also started discussing the implementation process publicly and especially with our colleagues at Well-Typed.

  • Ticket about adapting GHCJS’ code into a proper JS backend for GHC has been opened [#21078]. Feedback was very positive!
  • There were discussions about the process and an agreement to target GHC 9.6 release [email on ghc-devs, wiki page]
  • deriveConstants is a program used to generate some header file included in the rts package. While it is mainly useful for native targets, we had to make it support Javascript targets [!7585]
  • Javascript is going to be the first official target platform supported by GHC that has its own notion of managed heap objects. Hence we may need a new RuntimeRep to represent these values for Haskell codes interacting with JS codes via FFI. We opened !7577 into which we tried to make this new RuntimeRep non JS specific so that it could be reused for future backends targeting other managed platforms (e.g. CLR, JVM). It triggered a lot of discussions summarized in #21142.
  • GHCJS’s code generator was ported to GHC head [!7573]. In its current state, we can generate Javascript unoptimised code -- the optimiser hasn’t been ported yet -- by compiling a module with -c -fjavascript. It required many changes, not only to adapt to changes between GHC 8.10 and GHC head but also to avoid adding new package dependencies. It was also an opportunity to refactor and to document the code, which is still a work in progress.
  • GHC doesn’t use any lens library, hence to port the code generator we had to replace lenses with usual record accessors. It turned out that case alternatives in STG lacked them because they were represented with a triple. We took the opportunity to introduce a proper record type for them !7652

Plutus-apps JS demo

  • We improved the proof of concept JavaScript library for generating Plutus transactions with a given set of constraints and lookups, exposing functionality from the plutus-ledger-constraints package. [Report]

Reporting

· 9 min read

IOG is committed to improving Haskell developer experience, both by sponsoring the Haskell Foundation and by directly founding a team committed to this task: the Haskell DX team.

The team now tries to provide regular (monthly) updates about its work. This post is a bit longer because it covers all of 2021 which has not been covered anywhere else.

Code generation

  • Added a new backend for AArch64 architectures, especially to support Apple’s M1. Previously AArch64 was only supported via the LLVM based backend which is much slower. [!5884]
  • Added support for Apple’s M1 calling convention. In GHC 9.2.1 it implied making lifted sized types (e.g. Word8, Int16...) use their unlifted counterparts (e.g. Word8#, Int16#...); in GHC 8.10.7 – a minor release –  a less invasive but more fragile solution was implemented [commit].
  • Fixed a very old GHC issue [#1257] by making GHCi support unboxed values [!4412]: ByteCode is now generated from STG instead of directly from Core. It allows more Haskell codes to be supported by HLS and it even allows GHC code to be loaded into GHCi [link].
  • Fixed a bug in the Cmm sinking pass that led to register corruption at runtime with the C backend. Even if we don’t use the C backend, fixing this avoided spurious errors in CI jobs using it [#19237,!5755]
  • Fixed a register clobbering issue for 64-bit comparisons generated with the 32-bit x86 NCG backend [commit].
  • Fixed generation of switches on sized literals in StgToCmm [!6211]
  • Fixed LLVM shifts [#19215,!4822]

Linker

  • Fixed an off-by-one error in the MachO (Darwin) linker [!6041]. The fix is simple but the debugging session was epic!
  • Fix to avoid linking plugin units unconditionally with target code, which is wrong in general but even more so when GHC is used as a cross-compiler: plugins and target code aren’t for the same platform [#20218,!6496]

Cross-compilation

  • With John Ericson (Obsidian Systems) we finally made GHC independent of its target [!6791,!6539]. It means that there is no need to rebuild GHC to make it target another platform, so it now becomes possible to add support for a --target=... command-line flag [#11470]. It also means that a cross-compiling GHC could build plugins for its host platform in addition to building code for its target platform.
  • A side-effect of the previous bullet is that primops’ types are now platform independent. Previously some of them would use Word64 on 32-bit architectures and Word on 64-bit architectures: now Word64 is used on every platform. A side-effect of this side-effect is that we had to make Word64 as efficient as Word: it now benefits from the same optimizations (constant folding #19024, etc.). On 32-bit platforms, it reduced allocations by a fair amount in some cases: e.g. -25.8% in T9203 test and -11.5% when running haddock on base library [!6167]. We hope it will benefit other 32-bit architectures such as JavaScript or WebAssembly.
  • GHC built as a cross-compiler doesn’t support compiler plugins [#14335]. We have been working on refactoring GHC to make it support two separate environments in a given compiler session – one for target code and another for the plugin/compiler code. The implementation in [!6748] conflicts quite a lot with the support of multiple home-units that was added at about the same time. GHC needs to be refactored a lot more to correctly support this approach, so instead we implemented a different approach to load plugins which is more low-level and bypasses the issue [#20964, !7377].
  • We made GHC consider the target platform instead of the host platform in guessOutputFile [!6116]
  • Use target platform instead of host platform to detect literal overflows [#17336,!4986]

GHCJS

  • We updated GHCJS to use GHC 8.10.7 [branch]
  • We worked on making GHCJS’s codebase more suitable for integration into GHC: reducing the number of dependencies, avoiding the use of Template Haskell, reusing GHC’s build system, etc. There is now a GHCJS integrated into a GHC 8.10.7 fork [branch].
  • This experience led us to plan the realization of a JS backend into GHC head based on GHCJS. More information about this topic in our next report.
  • We worked on making GHC’s testsuite pass with GHCJS, triaging tests that legitimately fail on a JS platform from tests revealing real GHCJS issues. [LINK]

Windows

  • We seemed to be the first to try to build GHC on Windows with the updated GNU autotools 2.70 and this release made a breaking change to the way auxiliary files (config.guess, config.sub) were handled, breaking GHC’s build (#19189). The root cause of the issue couldn’t be easily solved so we modified GHC’s build system to avoid the use of these auxiliary files, bypassing the issue. Most GHC devs won’t ever notice that something was broken to begin with when they will update their GNU toolchain on Windows. [!4768,!4987,!5065]
  • Fixed cross-compilation of GHC from Linux to Windows using Hadrian [#20657,!6945,!6958]

Numeric

  • Fixed Natural to Float/Double conversions to align with the method used for Integer to Float/Double and added missing rewrite rules [!6004]
  • Made most bignum literals be desugared into their final form in HsToCore stage instead of CoreToStg stage to ensure that Core optimizations were applied correctly to them [#20245,!6376]
  • Some constant folding rules were missing and were added:
  • Allowed some ghc-bignum operations to inline to get better performance, while still managing to keep constant-folding working [#19641,!6677,!6696,!6306]. There is some work left to do (cf #20361) but it is blocked by #19313 which in turn is blocked by #20554 which should be fixed soon (!6865, thanks Joachim!).
  • The ubiquitous fromIntegral function used to have many associated rewrite rules to make it fast (avoiding heap allocation of a passthrough Integer when possible) that were difficult to manage due to the combinatorial number of needed rules (#19907, #20062). We found a way to remove all these rules (!5862).

Technical debt & modularity

  • Made several component of the compiler independent of DynFlags (parsed command-line flags):
  • Made the handling of “package imports” less fragile [!6586] and refactored some code related to dependencies and recompilation avoidance [!6528,!6346].
  • Abstracted plugin related fields from HscEnv [!7175]
  • Made a home-unit optional in several places [!7013]: the home-unit should only be required when compiling code, not when loading code (e.g. when loading plugins in cross-compilers #14335).
  • Made GHC no longer expose the (wrong) selected ghc-bignum backend with ghc --info. ghc-bignum now exposes a backendName function for this purpose [#20495,!6903]
  • Moved tmpDir from Settings to DynFlags [!6297]
  • Removed use of unsafePerfomIO in getProgName [!6137]
  • Refactored warning flags handling [!5815]
  • Made assertions use normal functions instead of CPP [!5693]
  • Made the interpreter more independent of the driver [!5627]
  • Replaced ptext . sLit with text [!5625]
  • Removed broken “dynamic-by-default” setting [#16782,!5467]
  • Abstracted some components from the compiler session state (HscEnv):
    • unit-related fields into a new UnitEnvdatatype [!5425]
    • FinderCache and NameCache[!4951]
    • Loader state [!5287]
  • Removed the need for a home unit-id to initialize an external package state (EPS) [!5043]
  • Refactored -dynamic-too handling [#19264,!4905]

Performance

  • Made divInt#, modInt# and divModInt# branchless and inlineable [#18067,#19636,!3229]
  • Fixed Integral instances for Word8/16/32 and showWord to use quotRemWordN [!5891,!5846]
  • Improved performance of occurrence analysis [#19989,!5977]
  • Fixed unnecessary pinned allocations in appendFS [!5989]
  • Added a rewrite rules for string literals:
    • Concatenation of string literals [#20174,#16373,!6259]
    • (++) . unpackCString# ⇒ unpackAppendCString# leading to a 15% reduction in compilation time on a specific example. [!6619]
    • Compute SDoc literal size at compilation time [#19266, !4901]
  • Fix for Dwarf strings generated by the NCG that were unnecessarily retained in the FastString table [!6621]
  • Worked on improving inlining heuristics by taking into account applied constructors at call sites [#20516,!6732]. More work is needed though.
  • Fixed #20857 by making the Id cache for primops used more often [!7241]
  • Replaced some avoidable uses of replicateM . length with more efficient code [!7198]. No performance gain this time but the next reader of this code won’t have to wonder if fixing it could improve performance.
  • Made exprIsCheapX inline for modest but easy perf improvements [!7183]
  • Removed an allocation in the code used to write text on a Handle (used by putStrLn, etc.) [!7160]
  • Replaced inefficient list operations with more efficient Monoid ([a],[b]) operations in the driver [!7069], leading to 1.9% reduction in compiler allocations in MultiLayerModules test.
  • Disabled some callstack allocations in non-debug builds [!6252]
  • Made file copy in GHC more efficient [!5801]
  • Miscellaneous pretty-printer enhancements [!5226]
  • Type tidying perf improvements with strictness [#14738,!4892]

RTS

  • Fixed issues related to the RTS’s ticker
    • Fixed some races [#18033,#20132,!6201]
    • Made the RTS open the file descriptor for its timer (timerfd) on Linux synchronously to avoid weird interactions with Haskell code manipulating file descriptors [#20618,!6902].
  • Moved GHC’s global variables used to manage Uniques into the RTS to fix plugin issues [#19940,!5900]

Build system / CI

  • Fixed Hadrian output to display warnings and errors after the multi screen long command lines [#20490,!6690]
  • Avoided the installation of a global platformConstants file; made GHC load constants from the RTS unit instead, allowing it to be reinstalled with different constants [!5427]
  • Made deriveConstants output its file atomically [#19684,!5520]
  • Made compression with xz faster on CI [!5066]
  • Don’t build extra object with -no-hs-main [#18938,!4974]
  • Add hi-boot dependencies with ghc -M [#14482,!4876]

Misc

  • Stack: fixed interface reading in hi-file-parser to support GHC 8.10 and 9.0 [PR, Stack#5134]
  • Enhanced pretty-printing of coercions in Core dumps [!4856]

· One min read

Hopefully 2022 should be the year GHC will get a JavaScript backend without relying on GHCJS. This month the team has been busy planning the work that needs to be done to get there!

Cross-compilation

  • GHCJS has been updated to reduce the gap with GHC 8.10.7 codebase to the point that GHC’s build system is used to build GHCJS
  • Internal work planning for the integration of GHCJS into GHC
  • A different approach to load plugins into cross-compilers has been implemented [#20964, !7377]
  • GHCJS has been exercised to showcase compilation of some Plutus applications

Modularity

  • A few “subsystems” of GHC have been made more modular and reusable by making them independent of the command-line flags (DynFlags) [#17957, !7158, !7199, !7325]. This work resulted in a 10% reduction in call sites to DynFlags and has now removed all references to DynFlags up to the CoreToStg pass, which is almost the entire backend of GHC.

Performance

  • Jeffrey wrote a new HF proposal about writing a Haskell Optimization handbook and has started working on it