Design
This project is partially inspired by the HaskellWiki article “The JavaScript Problem”, and aims to solve that problem in a way that cuts JavaScript out of the build process altogether. The article outlines many of the current issues with developing in JavaScript, as well as the alternatives that currently exist in the functional paradigm. In particular, the final section of the article mentions Emscripten:
“Emscripten — not Haskell→JS, but compiles LLVM/Clang output to JavaScript. Could possibly be used for GHC→LLVM→JS compiling, which I tried, and works, but would have to also compile the GHC runtime which is not straight-forward (to me) for it to actually run.”
This remark helped inspire the initial design of a viable pathway from Haskell code to working WebAssembly through LLVM, using Nix, which in turn led to this project. The final product of this project will be a simple-to-use tool that consumes Haskell code and produces runnable WebAssembly.
LLVM is a sensible stepping stone for this project for several reasons. First, LLVM is already the target of several widely used compilers, so its toolchain has accumulated many optimizations, and building on LLVM lets this project take advantage of them. Second, tools already exist in the LLVM ecosystem for compiling LLVM to WebAssembly. Finally, GHC already has a working LLVM backend. Together, these make LLVM the path of least resistance for this project. Currently, however, the LLVM code produced by the GHC LLVM backend cannot be directly translated into WebAssembly with the available tools because of compatibility issues. One task of this project is therefore to make available, as part of GHC, an altered version of the LLVM backend whose output is compatible with LLVM-to-WebAssembly compilation.
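The intended shape of this pipeline can be sketched as a list of command lines. This is only an illustration under stated assumptions: the module name "Main" is a placeholder, nothing is executed, and, as noted above, the second step is not expected to succeed on the output of the stock GHC LLVM backend. The flags shown are standard GHC, llc, and wasm-ld options; the exact invocations this project ends up using may differ.

```python
from typing import List


def pipeline_commands(module: str) -> List[List[str]]:
    """Return command lines for a Haskell -> LLVM -> WebAssembly pipeline.

    `module` is a hypothetical module name such as "Main".
    The commands are only constructed here, never run.
    """
    return [
        # 1. GHC compiles via its LLVM backend and keeps the intermediate .ll file.
        ["ghc", "-fllvm", "-keep-llvm-files", "-c", f"{module}.hs"],
        # 2. llc lowers the LLVM IR to a WebAssembly object file.
        ["llc", "-mtriple=wasm32-unknown-unknown", "-filetype=obj",
         f"{module}.ll", "-o", f"{module}.o"],
        # 3. wasm-ld links the object file into a .wasm module.
        ["wasm-ld", "--no-entry", f"{module}.o", "-o", f"{module}.wasm"],
    ]


if __name__ == "__main__":
    for command in pipeline_commands("Main"):
        print(" ".join(command))
```

Keeping the stages as data rather than running them makes it clear where the altered backend fits: it replaces the output of step 1 so that step 2 can accept it.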
Drawing together separately developed tools into a single cohesive product is often a fragile process that yields brittle results. Dependency management and build processes need to be handled with care, and the Nix language and ecosystem exist to handle these tasks reliably. Compiling GHC-produced LLVM to WebAssembly with the available LLVM tools is likely to be a sensitive process. Refining this process and capturing its logic in Nix will be done as a part of this project.
Producing sensible WebAssembly from Haskell code is not enough to fulfill the goals of this project. The produced WebAssembly code will be unable to run in the absence of a proper runtime. Specifically, the GHC RTS will need to be compiled to WebAssembly. The RTS exists as a combination of C and Cmm code within the GHC code base. The GHC LLVM backend already has the ability to take in Cmm code and produce LLVM, so the altered backend will serve double duty by aiding this process. A reliable C-to-LLVM compiler, Clang, already exists as part of the LLVM ecosystem. In addition, Emscripten furnishes implementations of several libraries, such as libc, that may be of use in this process. Leveraging these tools should make compiling the RTS to LLVM (and then WebAssembly) a far less daunting task.
A custom process for building GHC to suit this project will need to be designed. This process can be encoded with Nix. In fact, John Ericson has already done significant work on cross compilation with GHC using Nix. His work is available via Nixpkgs and will aid this effort.
This project will expose a simple-to-use interface to end users. It should be possible to encode in Nix the logic necessary to use the results of this project properly. The necessary calls to Nix can then be written into simple Bash scripts, so that using this project requires no knowledge of its inner workings or the tools it uses. In this manner, end users should be able to compile Haskell code to runnable WebAssembly with a single simple command.
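As a rough illustration of what such a wrapper might do, the sketch below assembles a single nix-build invocation from a source path. It is written in Python rather than Bash purely for illustration. The file name default.nix, its src argument, and the wasm attribute are hypothetical placeholders, since the project's Nix interface has not been designed yet; only the nix-build options themselves (--argstr, -A, -o) are existing ones.

```python
from typing import List


def nix_build_command(source: str, out_link: str = "result") -> List[str]:
    """Build the nix-build invocation a wrapper script might run.

    Hypothetical: `default.nix`, its `src` argument, and the `wasm`
    attribute are placeholders, not part of any existing interface.
    """
    return [
        "nix-build", "default.nix",
        "--argstr", "src", source,  # pass the Haskell source path as a string argument
        "-A", "wasm",               # select the (hypothetical) WebAssembly output attribute
        "-o", out_link,             # name of the symlink pointing at the build result
    ]


if __name__ == "__main__":
    print(" ".join(nix_build_command("Main.hs")))
```

The point of the design is that the user supplies only the source path; everything else is fixed by the wrapper and the Nix expression behind it.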