
The pitfalls of self-hosting JavaScript

Recently the SpiderMonkey team has been looking into improving ECMAScript 6 and real world performance as part of the QuantumFlow project.

While working on this we realized that self-hosting functions can have significant downsides, especially with bad type information. Apparently even the V8 team is moving away from self-hosting toward writing more functions in handwritten macro assembler code.

Here is a list of things I can remember off the top of my head:

  • Self-hosted functions that always call out to C++ (native) functions that cannot be inlined in IonMonkey are probably a bad idea.
  • Self-hosted functions often have very bad type information, because they are called from a lot of different frameworks, user code and so on. This means we absolutely need to be able to inline that function. (e.g. bug 1364854 about Object.assign or bug 1366372 about Array.from)
  • If a self-hosted function only runs in the baseline compiler we won’t get any inlining, which means all those small function calls to ToLength or Math.max add up. We should probably look into manually inlining more or even using something like Facebook’s prepack.
  • We usually only inline C++ functions called from self-hosted functions in IonMonkey under perfect conditions. If those are not met, we fall back to a slow JS to C++ call. (e.g. bug 1366263 about various RegExp methods)
  • Basically this all comes back to somehow making sure that even with bad type information (i.e. polymorphic types) your self-hosted JS code still reaches an acceptable level of performance. For example by introducing inline caching for the in operator we fixed a real world performance issue in the Array.prototype.concat method.
  • Overall just relying on IonMonkey inlining to save our bacon probably isn’t a good way forward.
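To make the point about small helper calls concrete, here is a minimal sketch in plain JavaScript. It is not actual SpiderMonkey self-hosted code: the names toIntegerSketch, toLengthSketch, indexOfWithHelpers and indexOfInlined are hypothetical, and the helpers only approximate the spec operations ToInteger and ToLength. The second variant shows what "manually inlining" could look like, assuming the common case is a length that is already a uint32.

```javascript
// Hypothetical stand-ins for the spec helpers; not SpiderMonkey's real code.
function toIntegerSketch(value) {
  const n = Number(value);
  if (Number.isNaN(n)) return 0;
  if (n === 0 || !Number.isFinite(n)) return n;
  return Math.trunc(n);
}

function toLengthSketch(value) {
  const len = toIntegerSketch(value);
  if (len <= 0) return 0;
  return Math.min(len, Number.MAX_SAFE_INTEGER);
}

// Helper-heavy version: every call goes through
// toLengthSketch -> toIntegerSketch unless a JIT inlines them.
function indexOfWithHelpers(list, value) {
  const len = toLengthSketch(list.length);
  for (let i = 0; i < len; i++) {
    if (list[i] === value) return i;
  }
  return -1;
}

// Manually inlined version: a cheap check covers the common case
// (length is already a uint32) without calling any helper.
function indexOfInlined(list, value) {
  let len = list.length;
  if ((len >>> 0) !== len) {
    len = toLengthSketch(len); // rare slow path for odd lengths
  }
  for (let i = 0; i < len; i++) {
    if (list[i] === value) return i;
  }
  return -1;
}
```

Both variants return the same results; the difference is only how much work happens per call when nothing gets inlined.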
Tags: mozilla

JS Team Update

While it has been quiet on our blogs, the JavaScript team is actually working on all kinds of cool stuff. Of course I won’t be able to cover even remotely everything.

For generational garbage collection we need to exactly root the JS engine, which means being able to find all roots (pointers) to objects on the C stack, because a moving GC needs to update these pointers. We have had a dynamic analysis that finds rooting issues for some time now, which used to cause thousands of failures when running our test suite. In the last few weeks we reduced this number to almost zero. (Bug 745742) The next big chunk of work is exactly rooting the browser. We also now have a static analysis that finds rooting issues, based on a program that works with the sixgill GCC frontend. (Bug 831409) Terrence also very recently prototyped a bump allocation nursery. (Bug 706885)

The new baseline JIT, which features a much simpler design compared to JägerMonkey and is eventually going to replace it, now has a lot of required features. What is left is work like implementing debugging support or jumping between IonMonkey and the baseline JIT. (Bug 805877) Of course there are still more operations and use cases that can and have to be optimized.

One performance fault in SpiderMonkey for a long time was the hefty deoptimization when using indexed properties like obj[15] on regular objects compared to array objects. We are happy to report that this has finally been fixed with Bug 827490! Brian is also working on a somewhat related issue that shows up when you fill in the elements of an array backwards. (Bug 835102)
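For readers who want to see what these two patterns look like in practice, here is a small sketch. The function names sumIndexed and fillBackwards are made up for illustration, and the snippet makes no claim about how fast either pattern is in any particular engine version.

```javascript
// Pattern 1: indexed properties on a regular (non-array) object.
function sumIndexed(count) {
  const obj = {};
  for (let i = 0; i < count; i++) {
    obj[i] = i; // same syntax as array element access
  }
  let sum = 0;
  for (let i = 0; i < count; i++) {
    sum += obj[i];
  }
  return sum;
}

// Pattern 2: filling an array from the last index down to 0.
function fillBackwards(count) {
  const arr = [];
  for (let i = count - 1; i >= 0; i--) {
    arr[i] = i * 2; // the first write already sets length to count
  }
  return arr;
}
```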

IonMonkey naturally also received improvements. One particular achievement is the 30% increase in performance on the Octane benchmark over the last two months, as well as fixes for various other benchmarks. Take a look at arewefastyet.

ECMAScript 6 (Harmony) features have been landing as well, like the new direct proxies and WeakMap/Set/Map functions. (Tracking Bug) There is however still work going on, for example on “Harmony modules”.
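As a quick illustration of these features, here is a minimal sketch using the API shape that was eventually standardized in ES6. The variable names (target, reads, proxy, metadata, counts, unique) are only for this example.

```javascript
// A direct proxy wraps an existing target object; traps intercept operations.
const target = { answer: 42 };
const reads = [];
const proxy = new Proxy(target, {
  get(obj, key) {
    reads.push(key); // record every property read
    return obj[key];
  }
});
const answer = proxy.answer;

// WeakMap: attach data to an object without keeping the object alive.
const metadata = new WeakMap();
metadata.set(target, "seen");

// Map accepts arbitrary keys, including objects.
const counts = new Map();
counts.set(target, 1);

// Set stores each value only once.
const unique = new Set([1, 1, 2]);
```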

We are now storing the unaltered source of (nearly) every function, which means Function#toString results in the original code. This allows us to remove the complex and error-prone decompiler. Because we still want to produce good error messages, we replaced it with an expression-only decompiler.
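A small sketch of what this means for script authors: with the source stored, toString can hand back the text as written, including comments and odd spacing, instead of a decompiled reconstruction. The function add is just an example.

```javascript
function add(a, b) {
  // This comment is part of the stored source.
  return a   +   b;
}

// With stored source, the comment and the unusual spacing survive.
const source = add.toString();
```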

Another exciting change is that we now have certain built-in functions self-hosted in JavaScript itself. So you can actually look at the implementation of Array#forEach! This brings performance improvements for some code, because we are now able to JIT compile and inline more code.
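To give a feeling for what a built-in written in JavaScript looks like, here is a simplified sketch of forEach that follows the general shape of the specification algorithm. It is not the actual SpiderMonkey self-hosted source, and the name forEachSketch is made up.

```javascript
// Simplified sketch of Array.prototype.forEach as a plain function.
function forEachSketch(list, callback, thisArg) {
  if (list === null || list === undefined) {
    throw new TypeError("forEachSketch called on null or undefined");
  }
  const obj = Object(list);
  const len = obj.length >>> 0; // coerce length to a uint32
  if (typeof callback !== "function") {
    throw new TypeError("callback is not a function");
  }
  for (let k = 0; k < len; k++) {
    // Skip holes in sparse arrays, as the built-in does.
    if (k in obj) {
      callback.call(thisArg, obj[k], k, obj);
    }
  }
}
```

Because this is ordinary JavaScript, a JIT can in principle compile it together with the caller and inline the callback, which is where the speedups mentioned above come from.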

We are removing E4X right now. *PARTY*

It really has been a long time, so I missed or glossed over a lot of changes. I am sorry!

Tags: mozilla

HTML5 download attribute

I just committed my work in Bug 676619, which allows you to use the download attribute with <a> and <area>. This feature was previously implemented in WebKit. It allows you to easily offer files as downloads. I am mostly excited about the possible interactions with Blobs.

var blob = new Blob(["Hello World"]);
var a = document.createElement("a");
a.href = window.URL.createObjectURL(blob);
a.download = "hello-world.txt";
a.textContent = "Download Hello World!";
// Attach the link to the page so the user can actually click it.
document.body.appendChild(a);

David Walsh and HTML5 Rocks blogged about this feature before. I would be interested in hearing what kind of applications you have for this.

This was also my first web facing feature outside the JavaScript engine.

Tags: mozilla

JavaScript Work Week & more

The last few weeks in my life have been both very taxing and unbelievably cool.

  • I went to Ireland for two weeks doing language study travel with my high school. A lot of fun and beautiful landscape. School starts at 9am, how cool is that? I actually had to leave one week earlier to attend the work week.
  • I was featured in Bonjour Mozilla.
  • And because it was already revealed: I will be doing an internship with Mozilla next year in Toronto, if everything goes well with visas etc.

So the Mozilla JavaScript team had their first work week in the last two years from the 22nd to the 26th of October in Mountain View. I actually arrived on Saturday already, after about 27h of traveling. We usually had sessions from 11am to 3pm with an hour of lunch in between. Most of the sessions explained different internals of SpiderMonkey. Dave Mandelin gave an inspiring talk about “Interest and Impact”. We had talks about IonMonkey and SpiderMonkey in general. We concluded with a brainstorming session about the future of SpiderMonkey/Firefox/Mozilla. We tried to find things that we think are important for the future (e.g. FirefoxOS, Open Web) and how we could make them happen. We came up with literally pages of ideas, including “better documentation”, “better communication” and “better debugging tools”, but also things like “Baseline Compiler” or “ES6 property refactoring”, which are pretty SpiderMonkey specific.

I was very happy to meet some people I only knew by their IRC nickname or name, with no idea what they look like, let alone what their voice sounds like. I was actually surprised that the team is so big! You just don’t realize how many people are involved when you are working in your own little niche. There were also quite a few people from Europe, yaay. I talked to some people and got to know them a lot better. For example, Naveed Ihsanullah, who put a lot of work into creating this event (Thank you!), worked with Windows Internals before, once a “passion” of mine that I never got to talk about with anybody.

You can really feel the difference when you can just walk over and talk to somebody, instead of staying on the computer at night because of your weird timezone. Some stuff I was never really able to make sense of was unbelievably easy to explain in person. Some days I would stay in the office until 11pm just to watch other people work :).

If you are keeping up with JS development it might be interesting to know that we started working on a new Baseline Compiler. We are still working heads-down on everything required to make Generational Garbage Collection possible, or even still fixing bugs in the Incremental GC! The JavaScript Debugging API v2 is also still a hot topic. Same for IonMonkey. Oh and people are working on ES6 support, for example Eddy on Harmony modules.

Oh and we had nice food, really, yummy! I think we sadly forgot to take a proper group photo. So here are most of us working!

I probably forgot thousands of things I wanted to mention over the last 3 weeks, while I was trying to get back into my regular school days.

Tags: mozilla

JavaScript function name inference aka stop function naming mayhem

When you are a Firefox developer you are probably familiar with the interesting way of obtaining a stack-trace from JavaScript code.

(new Error()).stack
/* or */
Components.stack

And historically unnamed (anonymous) functions made the stack-trace totally useless, so for example Firefox code has the policy to name such functions instead. Here is a small example.

var Sandbox = {
  execute: function () {
    throw new Error("Unsafe!"); // only Error objects carry a .stack
  }
};
try {
  Sandbox.execute();
} catch (error) {
  console.log(error.stack);
}

Historically this produced a stack-trace like this (executed in Scratchpad):

@Scratchpad:12
@Scratchpad:18

But in Firefox 17 you should see something like this. (This was implemented in Bug 433529)

Sandbox.execute@Scratchpad/1:12
@Scratchpad/1:16

So you don’t need to make up names like “SandboxExecute” for the function and still get something readable. At the moment Components.stack doesn’t yet include this magic, but it should land soon (Bug 805222). This name inference mechanism should be able to handle a few more cases; for details look at the first bug I linked.

Tags: mozilla

Berlin Compiler Meetup

A few weeks back, I went to the Berlin Compiler Meetup. Before that I spent a few enjoyable days in Berlin, and visited the Mozilla office there. I spent some quality time with Tim Taubert, while he was working on Bug 650968. It was quite refreshing to see how other people do coding, and how different their processes are compared to the JS Engine. I also noticed some possible areas of improvement on our side. E.g. platform people have to use dump(new Error().stack) instead of a real debugger. And because we don’t name anonymous functions, they have to make up names for all of them, to make the stack even remotely useful. Alex Crichton already wrote a patch to fix this.


So now let’s get to the actual meetup. We had three different talks.
Marijn Haverbeke talked about the kind of language design ideas, which he applied in his toy language. This talk was quite interesting, because I have never really looked into this stuff too much. But I also need to admit that it was sometimes way beyond my understanding.

Alexis Sellier talked about the Lua VM. I think he did a good job explaining the basics, e.g. what a register-based interpreter is and what the byte-code looks like. I looked into Lua previously, so I found his talk rather easy to follow.

I planned to give a talk about the upcoming IonMonkey JIT compiler. In the end I talked about a lot of different stuff, but not really what I planned for. I should have had some more slides, for example about our Value type. I wrongly assumed that most people at this meetup would have some knowledge about compiler design and SSA, and be able to read x64 assembly ;). I also planned to talk about some of the optimizations we implemented in IonMonkey. What we actually did was probably more useful to the audience: we busted some performance myths, talked about JIT anti-exploitation techniques and how we do benchmark driven development.

Here are my slides and some graphs of the different compiler passes. Sadly there is no recording, so you probably won’t be able to make much out of it.

Tags: mozilla

JIT Inspector and more

A quick Google search turned up nothing on planet, so I guess a lot of people haven’t seen Brian Hackett’s awesome experimental add-on JIT Inspector.

Prototype tool to track metrics for amount and optimization quality of JIT code executed.

Which basically means you can spot hot code points in JavaScript, which are run often or for some reason couldn’t be optimized well enough by SpiderMonkey. You can also look at different metrics, which could possibly tell you how well SpiderMonkey (in particular Type Inference) “understands” this line of code.

For the moment this tool is mostly useful when you have a deeper understanding of the SpiderMonkey internals, but this could definitely become a very handy tool for web developers etc.

I already used this in a few cases and it is really easy to spot problems, without having to start any kind of normal profiler like Shark.

Along these lines Hannes Verschore has been working on a graphing tool for IonMonkey, which shows the time spent in different “modules” of the engine, like Garbage Collection, JIT Compilation or regular expression execution. To run this tool you need to compile IonMonkey, so for now you are better off looking at the pretty graphs.

Tags: mozilla

Short update on what the JS team is up to

We actually wanted to enable Incremental GC on Nightly, but again we had some fallout and it had to be backed out. Bill thinks it should reland at the end of the week.

We are happy to welcome Benjamin Peterson, who is going to join us this summer as an intern working on SpiderMonkey’s ES6 support. Benjamin is an active Python contributor. He has already started implementing rest parameters.

Till Schneidereit, (a fellow German, finally!) started picking up some GC related bugs, thank you and feel welcome.

In an effort to reduce the memory usage of average JavaScript applications (MemShrink \o/), we came to the conclusion that it is okay to throw away JIT code compiled by Jäger on every Garbage Collection run. Unfortunately this doesn’t work very well for animation heavy scripts like games, where recompiling would introduce long pauses. Brian fixed that.

Jason showed us how to use the new Debugger API to debug JavaScript code running in Firefox.

David Mandelin and I blogged about the SpiderMonkey API (JSAPI), and what needs to change, C++ yeah!

The DataView object landed, thanks to the work of Steve.

Luke just finished a patch that is going to speed up the handling of some function parameters/variables. Besides being a prerequisite for more IonMonkey performance improvements, it already showed 10% better scores on the V8 earley-boyer benchmark. (Bug 659577)

Jan has been working on chunked compilation which should help IonMonkey with very large scripts. But because this is a very broad change and the Ion team likes to focus on stabilizing, fixing crashes and test failures first, this is going to land after the initial release. Luckily these kinds of large scripts are uncommon for normal JavaScript, but they are often found in Emscripten compiled code. JägerMonkey (+TI), which has chunked compilation, is still going to help those scripts.

Edit: Republished because of some tumblr problems.

Tags: mozilla

How the bad JSAPI is hurting us

The JSAPI has been relatively stable since its inception together with the first version of JavaScript in 1998. At least until 2008, when we started doing some rather deep changes with the introduction of compartments, the removal of threads and more, in most cases for more safety, more speed and less memory usage. But you could probably still understand code written in those dark ages.

And in 1998, it probably only made sense to use C and thus also design a C API. And that’s what we are left with today. Luckily not for too long anymore!

It’s time for a new API.

I agree with this sentiment wholeheartedly.

So let me enumerate two recent examples where a better and safer API could have stopped us from introducing bugs.

Case 1: off-by-one errors in JSClass initialization - Bug 747617

Every JavaScript object has a JSClass. This class describes how the object should behave when used in certain situations. It also holds some bits of information that are the same for every JavaScript object of that class, for example the class name, which you can get in JS by using Object.prototype.toString.call. But the main purpose is to implement special behavior for these objects through the use of “hooks”. If you have a bit of experience with the ES5 specification: that’s what we use to implement pseudo internal methods like [[Construct]] (sort of, at least). Because we obviously can’t use C++ virtual functions, we are left with function pointers in a struct. Because we can’t really enforce strict initialization rules and type checking failed to catch it, we actually assigned the wrong function (or NULL) to some hooks after removing one function from the struct and not updating all uses correctly. This broke an old hack to support document.all("elementID") from ancient IE.

Case 2: JSVAL_IS_OBJECT - Bug 752226

If I told you that JSVAL_IS_STRING checks if a Value (our way to represent the different types in JavaScript: Undefined, Null, Boolean, String, Number, and Object) is a String, and that JSVAL_IS_NULL checks if a Value is null, what would you think is the purpose of JSVAL_IS_OBJECT? If you thought it checks if a Value represents an object, you were right, but this is only one part of the answer. It also returns true when you pass a Null primitive to it. D’oh. This is kind of similar to the typeof operator, which also returns “object” for both null and objects. That could even have been a mistake, because Brendan already used that API in the wrong way when implementing JS. (Here is the implementation of typeof in Mozilla 1.0). So let’s get to the actual issue here. After you have checked that something is an object, it seems obviously okay to use JSVAL_TO_OBJECT and do whatever you want with the result. (JSVAL_TO_OBJECT is used to get a JSObject* out of the Value; as the name suggests, this represents an actual object in JS.) Of course a lot of people forget that this API returns a null pointer when you use it with a null primitive. When removing all the uses of JSVAL_IS_OBJECT, I noticed at least 5 places where this check was missing, which would lead to null pointer dereference crashes. The correct way to check if something is an object would have been !JSVAL_IS_PRIMITIVE, obvious right?!

There are probably a lot of other cases of bugs that were introduced because of our bad API. I can think of at least three more, which often showed that most developers using the API don’t really know what they are doing, but can you blame them with functions like JSVAL_IS_OBJECT?
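The same trap can be reproduced on the JavaScript side with typeof. The sketch below is only an analogy to the C macros, not JSAPI code; the function names looksLikeObject, isRealObject, unsafeKeyCount and safeKeyCount are made up for this example.

```javascript
// Analogous to JSVAL_IS_OBJECT: also true for null.
function looksLikeObject(value) {
  return typeof value === "object";
}

// Analogous to !JSVAL_IS_PRIMITIVE: true only for real objects and functions.
function isRealObject(value) {
  return (typeof value === "object" && value !== null) ||
         typeof value === "function";
}

// Trusts the loose check, so null slips through and Object.keys throws.
function unsafeKeyCount(value) {
  if (looksLikeObject(value)) {
    return Object.keys(value).length; // TypeError when value is null
  }
  return 0;
}

// Uses the strict check, so null is treated like any other primitive.
function safeKeyCount(value) {
  if (isRealObject(value)) {
    return Object.keys(value).length;
  }
  return 0;
}
```

In JavaScript the mistake surfaces as a TypeError; in C++ code using the JSAPI the equivalent mistake is a null pointer dereference.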

Here is a little gem Wes Garland shared with me that shows the bad zen of the JSAPI.

No, your method does not return false, you are simply making the JS engine stop dead in its tracks. If you want your JS function to return false, you need to make its implementation return JS_TRUE while setting *rvalp to JSVAL_FALSE.

Tags: mozilla

Mozilla JS Holiday Update

Although we had holidays, things went smoothly, and we had about 130 code landings since 5th December (depending on how well my mapping of commits to bugs worked). This doesn’t include patches landing on the IonMonkey branch, which have steadily grown.

We still find some unnecessary code after the removal of TraceMonkey. See the dependencies of Bug 698201.

Terrence Cole has been working on adding write barriers, which are required for a Generational GC to work.

Brian Hackett introduced an analysis to find objects that are not explicitly rooted (Bug 707049). We currently use a conservative stack scanner, which means we don’t garbage collect objects that values on the stack appear to point to. But a conservative stack scanner doesn’t know whether such a value is actually just an integer that happens to correspond to an object address, or whether it’s really a pointer to that object. So we can sometimes keep things alive longer than necessary.

For a moving GC, we need to identify all roots of an object (every pointer in the program that points to a particular object), so we can move the object around in memory, and rewrite the pointer.
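The difference between the two approaches can be sketched with a toy model. This is not how SpiderMonkey is implemented: the "heap" here is just a Map from numeric addresses to objects, the "stack" is a list of raw numbers, and the names conservativeMark and moveObject are hypothetical.

```javascript
// Conservative scan: any stack word that matches a heap address keeps
// that object alive, even if the word is really just an integer.
function conservativeMark(heap, stackWords) {
  const live = new Set();
  for (const word of stackWords) {
    if (heap.has(word)) {
      live.add(word); // pointer or coincidence? The scanner cannot tell.
    }
  }
  return live;
}

// Exact rooting: the collector knows every slot that holds a pointer,
// so it can move an object and rewrite each root to the new address.
function moveObject(heap, roots, from, to) {
  heap.set(to, heap.get(from));
  heap.delete(from);
  for (const root of roots) {
    if (root.address === from) {
      root.address = to;
    }
  }
}
```

With only the conservative scan, moving an object is unsafe, because rewriting a stack word that was really an integer would corrupt the program.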

Brian Hackett also worked on introducing inline caching to IonMonkey. We had quite a few problems with the inline caches in JägerMonkey on AMD64, which we are going to avoid this time :)

Bill McCloskey landed the Incremental Garbage Collector on the larch project branch. From my understanding he is now actually nearly done, but still needs to fix some stability issues.

Jeff Walden continued working on splitting the storage of properties and elements in every object (Bug 586842). After this work has been done, index access on every object is going to be as fast as on arrays. A typical element access looks like this: obj[123]. This is a lot of work because he does the conceptual splitting of element and property access already at the lowest level, so we don’t introduce bugs and have a clean cut. Clean-ups in the parser, yummy.

Jan de Mooij has been fixing a lot of bugs in IonMonkey that made us crash when running the V8 or SunSpider benchmark. Some of these bugs he fixed last week were in the Linear Scan Register Allocator.

David Anderson already started simplifying the IonMonkey design (sic!), as some unnecessary complexity has been identified. Like Jan he also fixed bugs/crashes in IonMonkey.

Nicolas B. Pierron and Eddy Bruel also have been eagerly working on Ion, implementing new operations (like multiplication, the this operator etc.) and fixing bugs.

IonMonkey can now already run some tests, but most of the time we still fall back to the interpreter.

Luke Wagner was responsible for the code cleanups mentioned in the beginning.

The mysterious Ms2ger is earning a gold star for removing includes of private SpiderMonkey headers from the rest of the mozilla source.

I have been working on some correctness issues and one performance regression from the tracer removal, because we don’t optimize Math.round in the same way anymore. I am also trying to convince everyone to remove E4X.

I hope I didn’t forget anyone, and that reading this far wasn’t too painful.

Tags: mozilla