
Wednesday, November 30, 2011

Simulating ActionScript in JavaScript: Private Members

Inspired by Adobe's Bernd Paradies, member of the FalconJS / FlashRT team, blogging about his ideas on "Compiling ActionScript to JavaScript", I thought it was about time to give more details about how Jangaroo implements ActionScript language features. Because there are so many of them, blogging about them one at a time seems like a good idea. Let's start with "private members".

The Early Days
When people started simulating class-based OOP in JavaScript, many thought that it was not possible to define private members that support the principle of information hiding, because in JavaScript, all properties of an object are publicly accessible. So in many cases, the convention was to let private member names start with an underscore ("_"), meaning "don't use this member, best assume it does not exist at all".
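A minimal sketch of that convention, with a made-up Counter class (the class and field names are only illustrative). It shows that the underscore is just part of the property name and protects nothing:

```javascript
// Underscore convention: "_secret" is private by agreement only.
function Counter() {
  this._secret = 3; // the underscore is simply part of the property name
}
Counter.prototype.dec = function () {
  if (this._secret > 0) {
    this._secret -= 1;
    return true;
  }
  return false;
};

var counter = new Counter();
counter._secret = 100;             // nothing stops outside code from doing this
console.log(Object.keys(counter)); // the "private" field is listed like any other
```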

The Reference Solution
As we all know by now, it is possible to implement real information hiding in JavaScript, namely by using closures / lexical scoping. Douglas Crockford (who else?) wrote a nice summary of this approach back in 2001: Private Members in JavaScript
So it looks like all we have to do to simulate ActionScript's private members in JavaScript is to use this pattern, right? Of course, it is not that easy, as in practice, there are two problems with private members à la Crockford:
  1. Runtime overhead
  2. Source-code placement
While the first is a general problem, the second is Jangaroo-specific.
The runtime overhead of closures results from the fact that in Crockford's solution, private members and privileged methods (methods that use private members) have to be defined inside the constructor. This means that the corresponding function objects are created for every instance of the class! This results in a performance penalty as well as in increased memory usage. Of course, this is not an issue for classes with few instances (extreme: singletons), but for classes like events, where you create many, many instances, this can quickly become a major problem.
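To make the overhead visible, here is a rough sketch of the closure pattern applied to a container class. The name ClosureContainer is chosen for this illustration only; the point is that the private and privileged functions are distinct objects for every instance, while a prototype method is shared:

```javascript
// Crockford-style class: private state lives in the constructor's closure.
function ClosureContainer(param) {
  this.member = param;
  var secret = 3;                 // private variable
  function dec() {                // private method, re-created per instance
    if (secret > 0) {
      secret -= 1;
      return true;
    }
    return false;
  }
  var that = this;
  this.service = function () {    // privileged method, re-created per instance
    return dec() ? that.member : null;
  };
}
// A public method on the prototype exists only once.
ClosureContainer.prototype.stamp = function (string) {
  return this.member + string;
};

var a = new ClosureContainer("a");
var b = new ClosureContainer("b");
console.log(a.service === b.service); // false: one function object per instance
console.log(a.stamp === b.stamp);     // true: shared via the prototype
console.log(a.secret);                // undefined: not reachable from outside
```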
The second problem, source-code placement, results from a specific Jangaroo feature. In debug mode, the Jangaroo compiler generates JavaScript code that keeps every piece of executable code at exactly the same line as in the ActionScript source file. This is a central feature of Jangaroo and very important for source-level debugging with a free choice of JavaScript debugger. So how can we place private and privileged methods inside the constructor, when in the ActionScript source code, they are not?

The Super-Secret Private This
To solve the source-code placement issue, I came up with the idea of a "shared private this" object. It turned out that this approach makes things a bit more complicated and only mitigates (but does not remove) the runtime overhead, so it did not actually make it into Jangaroo, but I'd still like to present it here.
The basic idea is that private members, as opposed to being local variables in constructor scope, are properties of a single object that is defined inside the constructor. To be shared, this "private this" object has to be "injected" into, i.e. put in the surrounding lexical scope of every privileged method. Consider the following class, which is the ActionScript version of the example used by Douglas Crockford:

 1 public class Container {
 2   public var member:String;
 3   private var secret:int;

 5   public function Container(param:String) {
 6     member = param;
 7     secret = 3;
 8   }
10   public function stamp(string:String):String {
11     return member + string;
12   }
14   private function dec():Boolean {
15     if (secret > 0) {
16       secret -= 1;
17       return true;
18     } else {
19       return false;
20     }
21   }
23   public function service():String {
24     return dec() ? member : null;
25   }
26 }
To inject the "private this" into all privileged methods, we wrap these inside functions. What Douglas calls that, we'll call this$, and all other helper identifiers are suffixed with $. Here is the resulting JavaScript code:

 1 Container = (function() {
 2   //public var member:String;
 3   //private var secret:int;
 5   var Container = function(param) { var this$ = init$(this);
 6     this.member = param;
 7     this$.secret = 3;
 8   };
10   Container.prototype.stamp = function(string) {
11     return this.member + string;
12   };
14   var dec = function() {
15     if (this.secret > 0) {
16       this.secret -= 1;
17       return true;
18     } else {
19       return false;
20     }
21   };
23   var service$ = function(this$){return function(){
24     return this$.dec() ? this.member : null;
25   };};
27   var init$ = function(o) {
28     var this$ = {
29       dec: dec
30     };
31     o.service = service$(this$);
32     return this$;
33   };
35   return Container;
36 })();
Note how all executable code stays in exactly the same line, and how similarly it can be expressed in JavaScript!
The following variants of Douglas' patterns were used:
  • Private methods are only defined once (reduces runtime overhead!), and are supposed to be called on the "private this". If they need to access public members, too, we could easily extend the "private this" with a reference to the "public this".
  • Privileged methods are wrapped in a function that receives the "private this", so that the current "private this" is in lexical scope. 
  • To keep all source code at its original location, all additional initialization is moved to a generated method init$. It creates the "private this" this$ with all private members, creates instances of privileged methods handing in this$, and assigns these to the "public this".
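The three points above can be tried out with a compact, unnumbered variant of the listing. It is only a sketch: the back reference public$ and the extra private method describe are hypothetical names added here to illustrate the "reference to the public this" idea, and they were not part of the original example.

```javascript
var Container = (function () {
  var Container = function (param) { var this$ = init$(this);
    this.member = param;
    this$.secret = 3;
  };
  // Private method: defined once, meant to be called on the "private this".
  var dec = function () {
    if (this.secret > 0) {
      this.secret -= 1;
      return true;
    }
    return false;
  };
  // Private method that also reads a public member through the back reference.
  var describe = function () {
    return this.public$.member + ":" + this.secret;
  };
  // Wrapper that puts this$ into the lexical scope of the privileged method.
  var service$ = function (this$) { return function () {
    return this$.dec() ? this$.describe() : null;
  }; };
  // Generated initializer: builds the "private this" and the privileged methods.
  var init$ = function (o) {
    var this$ = { dec: dec, describe: describe, public$: o };
    o.service = service$(this$);
    return this$;
  };
  return Container;
})();

var c = new Container("abc");
console.log(c.service()); // "abc:2"
console.log(c.secret);    // undefined: the field lives on the "private this" only
```

Note that service is still created once per instance, which is the remaining overhead mentioned above; only dec and describe are shared.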

The Pragmatic Solution
As said above, this solution still introduces runtime overhead and a bit of complexity, so we looked for a more efficient and simpler alternative. I mentioned the naive approach of using a simple naming convention. The real flaw in this approach is not that you can access "private" members when you should not be able to. When compiling ActionScript to JavaScript, we can check access rights on the ActionScript source, so this is not an issue. But what this approach breaks is that private members are also used to avoid name clashes between subclass and superclass!
Imagine you define a framework and provide some base class that is supposed to be extended by clients of the framework. This base class usually provides members with different visibility. The private members (and the "internal" ones, which we neglect for now) cannot be seen by a subclass. A client's subclass could define its own private members any way they like, as long as they don't name-clash with public or protected members of the superclass.
Now you update the framework and add a private method, as you think this should not change the framework API. But if we use the naive naming convention of prefixing all private members with an underscore (or the like), there is now the chance that the new private member of the superclass name-clashes with an existing private member of the client's subclass!
Jangaroo's pragmatic solution to separate private members of different inheritance levels within one class is to suffix private member names with a $ followed by the numeric inheritance level. Object has inheritance level 0, and a class extending X has the inheritance level of X plus one. This effectively prevents name-clashes between private members of the same name, but defined on different inheritance levels, and thus solves the framework update problem described above.
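The following sketch contrasts the two naming schemes. All class names and the extend helper are made up for this illustration and are not Jangaroo's runtime API:

```javascript
// Minimal inheritance helper for this sketch (hypothetical, not Jangaroo's runtime).
function extend(Sub, Base) {
  Sub.prototype = Object.create(Base.prototype);
  Sub.prototype.constructor = Sub;
}

// Underscore convention: both levels write to the same property.
function Base() { this._count = 10; }
Base.prototype.baseCount = function () { return this._count; };
function Sub() { Base.call(this); this._count = 0; } // clobbers Base's field
extend(Sub, Base);
console.log(new Sub().baseCount()); // 0, although Base stored 10

// Inheritance level suffix: SafeBase is level 1, SafeSub is level 2.
function SafeBase() { this.count$1 = 10; }
SafeBase.prototype.baseCount = function () { return this.count$1; };
function SafeSub() { SafeBase.call(this); this.count$2 = 0; }
extend(SafeSub, SafeBase);
console.log(new SafeSub().baseCount()); // 10: the two fields no longer collide
```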
We used to compute the inheritance level of a class at runtime, but this made class loading and initialization more complex. Thus, we later decided to let the compiler compute the inheritance level. Of course this reduces "binary" compatibility, meaning that when you refactor a framework class e.g. by introducing an intermediate class in the inheritance hierarchy, clients will have to recompile their code, or the inheritance level of their subclass will be incorrect. But recompiling (without changing the source code) after updating a framework should not really be a problem.
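For illustration, a runtime computation of the inheritance level could look roughly like the sketch below. It walks the prototype chain using the ECMAScript 5 function Object.getPrototypeOf; this is an assumption about one possible implementation, not the code the Jangaroo Runtime actually used, and inheritanceLevel and privateName are hypothetical helper names.

```javascript
// Counts prototype-chain steps from a constructor's prototype up to Object.prototype.
function inheritanceLevel(Ctor) {
  var level = 0;
  var proto = Ctor.prototype;
  while (proto !== Object.prototype && proto !== null) {
    level += 1;
    proto = Object.getPrototypeOf(proto);
  }
  return level;
}

// Builds the renamed member, e.g. "secret" on a level-1 class becomes "secret$1".
function privateName(Ctor, name) {
  return name + "$" + inheritanceLevel(Ctor);
}

function Container() {}
function SpecialContainer() {}
SpecialContainer.prototype = Object.create(Container.prototype);

console.log(inheritanceLevel(Object));                // 0
console.log(privateName(Container, "secret"));        // "secret$1"
console.log(privateName(SpecialContainer, "secret")); // "secret$2"
```

Inserting an intermediate class between Container and SpecialContainer would change the last result to "secret$3", which is exactly why precompiled client code has to be recompiled when the level is baked in by the compiler.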
To give a concrete example, here is the simplified generated JavaScript code for the example above. It is not exactly what Jangaroo would produce, as there are many other features covered by the Jangaroo Runtime, which I'll elaborate on in upcoming blog posts. Also, it does not illustrate the inheritance issue, but just imagine a subclass that also defines a private field secret, which would then be renamed secret$2.
 1 Container = (function() {
 2   //public var member:String;
 3   //private var secret:int;
 5   var Container = function(param) {
 6     this.member = param;
 7     this.secret$1 = 3;
 8   };
10   Container.prototype.stamp = function(string) {
11     return this.member + string;
12   };
14   Container.prototype.dec$1 = function() {
15     if (this.secret$1 > 0) {
16       this.secret$1 -= 1;
17       return true;
18     } else {
19       return false;
20     }
21   };
23   Container.prototype.service = function() {
24     return this.dec$1() ? this.member : null;
25   };
27   return Container;
28 })();
Let me conclude with an overview of the four solutions for private members in JavaScript discussed here.
Solutions (table columns): naive naming convention | private members (D. Crockford) | private this | inheritance level suffix (used in Jangaroo)
Criteria (table rows): information hiding | avoiding name-clashes | no runtime overhead | keep source lines
As you can see in the table, there is no perfect solution (no column with checkmarks in every row), so for Jangaroo, we chose the pragmatic one that ensures good performance, while having some drawbacks on information hiding.

Since performance is the only argument against the "private this" solution, we should invest in performance analysis of today's JavaScript engines to quantify the time and space overhead introduced by the approach. Maybe it is not that bad after all.
With all modern browsers supporting the JavaScript API Object.defineProperty(), we can improve the information hiding of Jangaroo's pragmatic solution by defining all private members as non-enumerable. Taking a closer look at ActionScript semantics, actually no class members are enumerable, i.e. none of them are visible in a for ... in loop (only dynamic properties are).
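A small sketch of that idea, assuming the renamed field secret$1 from the example above. Only the private field is made non-enumerable here; to match ActionScript more closely, member would have to be defined the same way:

```javascript
function Container(param) {
  this.member = param;
  // Define the renamed private field as non-enumerable but still writable.
  Object.defineProperty(this, "secret$1", {
    value: 3,
    writable: true,
    enumerable: false,
    configurable: false
  });
}

var c = new Container("abc");
var visible = [];
for (var key in c) {
  visible.push(key);
}
console.log(visible);    // only "member" shows up
console.log(c.secret$1); // 3: still accessible by name
```

This hides the field from enumeration only; anyone who knows the name can still read and write it.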
There are several possible improvements to approximate ActionScript semantics more closely when relying on features of a modern JavaScript engine (ECMAScript 5). I'll come to these when discussing further ActionScript language features.


Bernd Paradies said...

This is all very interesting. Thanks for sharing those details! The way Jangaroo hides private members is crafty. But I am wondering: Shouldn't most access violations due to accessing private members be caught at compile time? In theory the only reason for checking access violations at runtime is because of untyped code, i.e. assuming that foo() is a private method of Sprite:

var sprite: Sprite = new Sprite();
var spriteObj : Object = sprite;

I am simply wondering: Shall Jangaroo care about those cases? I think so. But in my opinion Jangaroo should only care about access violations in debug builds. In release builds I would not use any code that checks against access violations.
That is what I call the "dart rules". Please see this blog post if you are interested in the details and let me know what you think:

Frank said...

I hope I didn't miss the point of your question, but Jangaroo does not try to catch private member access violations at runtime. As said in the blog post, the reason to rename private members is to avoid name clashes, and all renaming is done at compile time.
However, you spotted an issue I skipped in the blog post: What about untyped access to private members? How can the Jangaroo compiler rename access to a private member, if the expression to the left of the dot is untyped?
Since the answer is rather long, I am already working on a follow-up blog post.
The idea you mention to generate different code for debug and release builds is already implemented in Jangaroo. We provide a built-in assert() function that is interpreted by the compiler: If assertions are disabled (using a compiler flag you could set when creating release builds), assert() function calls are completely suppressed in the generated code. So if there were any runtime checks for private member access (which, as said above, is not the case), they could be treated similarly.

Frank said...

The follow-up post about untyped private member access is online now!

Frank said...

One important argument against using JavaScript information hiding a la Crockford and also my "private this" solution I realized only now is that it does not resemble the ActionScript semantics!
While JavaScript privates are private to an object, ActionScript privates are, like in Java, private to a class. That means in ActionScript, an instance of class A may access a private member defined in A not only on "this", but also on any other instance of A. This is often used in equals() method implementations.
I think being more strict than the original ActionScript semantics is a no-go. Your code is "green" and compiles, but then at runtime, you are told " is undefined"?!
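The difference can be sketched with a made-up Point class (names are illustrative only). With renamed properties, equals() can read the privates of another instance, as in ActionScript; with closure-based privates, it cannot:

```javascript
// Suffix approach: privates are ordinary properties, so class-level access works.
function Point(x, y) {
  this.x$1 = x;
  this.y$1 = y;
}
Point.prototype.equals = function (other) {
  return other instanceof Point &&
    this.x$1 === other.x$1 && this.y$1 === other.y$1;
};
console.log(new Point(1, 2).equals(new Point(1, 2))); // true

// Closure approach: each instance can only see its own privates.
function ClosurePoint(x, y) {
  this.equals = function (other) {
    // The x and y of "other" live in another closure; other.x is undefined here.
    return x === other.x && y === other.y;
  };
}
console.log(new ClosurePoint(1, 2).equals(new ClosurePoint(1, 2))); // false
```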

Frank said...

Another new idea for a little improvement:
Instead of using the pattern privateField$1, which is a valid ActionScript identifier and thus could name-clash with a field specified explicitly by the developer, we could better use a separator character that is not allowed in an ActionScript identifier, like ' ' (space). The generated code would have to use square brackets notation, like so: this['privateField 1'].
Obviously, the disadvantage is that it looks and types less nicely in the debugger.
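As a sketch of what the generated code might then look like (the exact naming scheme is an assumption, nothing Jangaroo emits today):

```javascript
function Container() {
  this["secret 1"] = 3;   // generated name: cannot be written as an identifier
  this.secret$1 = "user"; // a developer-chosen field named "secret$1" stays separate
}

var c = new Container();
console.log(c["secret 1"]); // 3
console.log(c.secret$1);    // "user"
```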