
Compile-time Optimizations for You and Me

Miguel Camba speaking at EmberFest in October, 2017

About this talk

More and more, frameworks are becoming smarter and optimizing our apps for us like compilers do. How does it work? Where are the limits of that? Can we haz nice thingz today? Ember's logic-less templates make it the framework best suited to perform static analysis and optimizations, and we are going to learn how to take advantage of that today and get a glimpse of what the future holds for automating this in the framework.


Transcript


Thanks, everyone. I'm going to talk about compile-time optimizations today. I want to clear the air before I start: I swear there's not going to be a video game in this one. I know there are some rumours, but it's just not true. So first of all, I'm a web developer from Spain, from this corner there. I was living in London before, but I decided to move back. I don't really remember why I decided to do this. I can't think of a good enough reason, but you can actually see my home over there. Now I'm working for DockYard, and you can find me on the internet on any of those handles.
This is also going to be a talk about frameworks and compilers. They seem to be different things, but more and more they are becoming one and the same. I'm going to start with frameworks. Frameworks, if we think about what they are, are tools we use for building applications. And we have to describe what an application is. In a very naive sentence, it's basically something that presents the right HTML to the user at every moment as the user interacts with the application. That definition fits every web application.
Frameworks as we understand them have become more competent with time. In the beginning they were only a set of runtime libraries we used to make our day-to-day life easier: mostly an aggregation of libraries, utilities and polyfills to make things like interaction easier, like jQuery. We then decided that JavaScript was not the best language ever and we started saying things like: okay, I want to write CoffeeScript. Then, I want to write JavaScript, but the next version of JavaScript. Or we want to write TypeScript, even. We also saw that some tasks, like writing CSS or templates, could benefit from having some tooling to help manage complexity rather than changing the syntax completely.
That's why we created things like JSX, Sass, and a plethora of other things like minifiers, image optimizers and spriters, and after all that we ended up creating CLI tools to make all these things manageable. We use Ember CLI, but there are plenty of them.
The next thing I want to talk about is compilers. The definition of compilers, from Wikipedia almost, is: programs that we use to build other programs by transforming high-level code into low-level code, into an executable program. An executable program means an application for us. And if we think about compilation per se, not how we developers understand compilation but how people in general understand it, it is the action of assembling information from several sources into a single entity we can share. A few weeks ago, Tom Dale wrote this think piece about how frameworks and compilers are becoming one and the same as applications become more complex.
Talking about compilers, most of them follow this approach. They have this rough structure where you have a code analyser, which in the case of Ember works concretely on Ember templates, because Ember is a very template-driven framework: you call components from templates. This code analyser is the Glimmer parser. Then the intermediate code generator is the Glimmer preprocessor. After that we may optimise the code, we'll come back to that later, and the final code generator is the thing that actually ships what we call the wire format, which is the JSON structure we are shipping right now to the client, where this thing is run. And this talk is going to be about these two areas. To actually complete the parallelism between frameworks and compilers, the only thing we're missing is these areas. Or we were missing, actually, because we already have one of them done, and the other one is coming.
So I want to start with the good news. The basic idea I want you to take from this talk is that Ember right now is the framework that will lead compile-time optimizations in the future. There is no other framework right now that has stronger and better foundations to be at the forefront of development in this area, with maybe Angular following behind thanks to the type system in TypeScript. Vue and React made some fundamental decisions that make this harder to happen.
Why do I claim that? It's quite a bold statement. I claim that because Ember made a decision far, far back in the past, almost six years ago, that is going to start paying dividends in the future. Well, the future is right now: it's starting to pay dividends, and it's going to look like more and more of a good idea. It is the rule of least power. This decision was made when designing Handlebars, and it suggests using the least powerful programming language for a given task. In the case of Ember and Ember templates, that is Handlebars. So if we put all the frameworks on a scale, from the least powerful templating language, or templating system, to the most powerful one, Ember would be the least powerful one. It looks like this, where you can see at a glance what is dynamic, because it's inside curly braces, and what is static, which is pretty much anything that looks like an HTML element. It's very limited: you cannot call or evaluate anything there. You cannot reference things that live elsewhere. So it is very, very constrained, and the things you can do with it are limited, so you need to push the logic somewhere else. This is what we call logic-less templates.
On this scale, the next one is Vue. For those who don't know Vue, it's a framework that is like React, but it uses templates instead of JSX.
So it looks more or less like this: you have special attributes, in this case v-for, to codify how this is supposed to behave, how it is supposed to iterate or generate if statements. The syntax inside looks like JavaScript but is not the same JavaScript, and it's still pretty limited in variance or variability, but it's slightly more complex than what you have in Handlebars, because Handlebars is basically as simple as it can possibly be. Then we have Angular, which is the same idea but slightly more complex, in the sense that you can write things like user in users, track by my function, pipe this into orderBy. So the rules, while still not as complicated as pure JavaScript, allow more things you can do in the templating system. And finally you have React, and the React family like Preact, Inferno and others like that, which decided to go the other way around: instead of putting logic in your templates, you generate your HTML inside your JavaScript. This is the one that by far gives you the most power, in the sense that you have JavaScript, and you have the full power of JavaScript to do whatever you want. So you can use libraries, external libraries, use dependencies. You have filter, map, you can do mathematical operations, anything. So this is full power, and power seems like a good thing to have. I mean, who wouldn't like power? Power sounds very, very good. You want power; I do.
The problem is that it turns out there is this axiom in compilers: the more flexible something is, the harder it is for a compiler to know how it works and to optimise it. Compilers can only optimise things when they know for sure how they are going to work. If something is going to work in one way 99% of the time, that's not enough; 99.9 is not enough. A compiler has to be sure that the thing it is doing is correct.
So if we put the frameworks on a scale, from more logic-less, so to speak, to full JavaScript: Ember, Vue and Angular are more or less restrained, although Ember is the most restrictive one, and React is the one where you have JavaScript and you can do whatever you want in there. And if we sort them by the approach they use, we have the VDOM family, React and Vue. The thing with them is that they create a virtual DOM structure, and whenever there is a change they create another one, see what has changed, and create a diff of what has to change. The thing is that neither of them has the concept of dynamic and static attributes. That means they have to compare everything, whether it is static or not. Meaning they are comparing a p tag with a text against a p tag with a text when there is nothing inside that paragraph that is dynamic. So they're doing work that could be avoided. Angular, on the other side, does dirty checking: it actually keeps track of the previous value and the new value and can do more fine-grained updates. And Ember uses a VM: it can also do fine-grained updates, but it can also go beyond that and use other techniques that are not about diffing, that belong more to the realm of virtual machines, because that's what Glimmer is.
Okay. With all of this, where do we, Ember, stand on optimising? Are we optimising already? There is good news: we are already optimising a few things, and I'm going to go through some of them. If you remember the graph from before, there were two optimising phases: the general optimization, and then the machine-dependent code optimizer. And we do have a machine-dependent code optimizer.
Everyone in this room already has this, unless you are on a very old Ember. As soon as we have a template like this, just by looking at the code we can immediately distinguish what is dynamic, because it's between curly braces, and everything else is static. When Glimmer generates the opcodes for this, it turns out that Glimmer does not have only one virtual machine; it has two. There is the append virtual machine, which has to take care of a few things: it has to take care of static and dynamic content, and it's the one that has to take care of things like rehydration of server-side rendered content. But it doesn't need to take care of other things, like keeping track of whether values have changed, because you only use it to render things from scratch. So there is nothing to keep track of. On the other side you have the update virtual machine, and these are the opcodes for it: as you see, we have three dynamic elements and we have three instructions. From the template we have distinguished what is static and what is dynamic, and we have generated different opcodes for different virtual machines. That is a good example of what is called a machine-dependent code optimizer; even if we are not actually talking about CPU architectures, it is still the same concept.
And today, although it's not shipped in Ember yet, it's already in Glimmer in the very latest betas, we, well, not we, because I haven't had anything to do with that, have the optimising compiler. If you have read the blog post on emberjs.com, I think it was published yesterday or the day before, about how this works, it's a very interesting concept, and it has a few optimizations, one of my favourites being the constant pool. The constant pool is a very simple concept. Take this template; it's a very simple one: you open a div with a class foo, and you have another div inside. I mean, this is as simple as a template can be.
And this is more or less the pseudocode; this is not exactly how it's represented today, but it's more readable for humans. You have a few opcodes. You see that open element: you open a div, you add class foo, then you open another div, and then you close the second div and then the first div. So you just generate things like this. And in here you are repeating div twice. In this template, true, we have div twice; we have two divs. But in your application, at least in my applications, I would probably repeat div a few thousand times, tens of thousands at least. So over time this is a lot of repetition, and not only repetition that takes space in the payload: it also creates string instances that then have to be garbage collected. So one optimization for this is that the optimising compiler has the constant pool that, my bad. There. It has the constant pool, which stores each string only once, without repetitions, in a single collection, and it can reference the string by index. In the binary compiler there is actually a kind of memory reference, but here we can think about it like this: the string in position zero happens to be div, class happens to be string one, foo happens to be index two, and if you have a thousand divs, you only hold the string div, within quotes, once. Without repetition. So you save thousands of strings.
But there's even more. This saves space, but if you go beyond that there is the binary packaging, which is the thing I'm most excited about. That is, you can represent these instructions in binary, because there are only, I believe, 80-something instructions? Anyway, it's a very limited number.
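As an editorial aside, the constant pool described above can be sketched in a few lines. This is only an illustration under stated assumptions: the opcode names (`openElement`, `staticAttr`, `closeElement`) and the `buildConstantPool` helper are hypothetical and are not Glimmer's real opcodes or API.

```javascript
// Hypothetical opcodes for <div class="foo"><div></div></div>.
// The names are illustrative only; they are not Glimmer's real instruction set.
const program = [
  ["openElement", "div"],
  ["staticAttr", "class", "foo"],
  ["openElement", "div"],
  ["closeElement"],
  ["closeElement"],
];

// Store every string operand once and replace it with its index in the pool.
function buildConstantPool(ops) {
  const pool = [];
  const indexByString = new Map();
  const intern = (value) => {
    if (!indexByString.has(value)) {
      indexByString.set(value, pool.length);
      pool.push(value);
    }
    return indexByString.get(value);
  };
  const encoded = ops.map(([name, ...operands]) => [name, ...operands.map(intern)]);
  return { pool, encoded };
}

const { pool, encoded } = buildConstantPool(program);
console.log(pool);    // [ 'div', 'class', 'foo' ]
console.log(encoded); // "div" is now referenced as index 0 wherever it is used
```

However many times `div` appears in the program, the pool holds the string once and every instruction only carries a small integer.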
You can actually encode all of this: open element can be this sequence of bits, because you know it, it's on a list. The next two bits are the number of arguments, because every operation takes at most three arguments, so two bits are enough. In this one, you know it has one argument, so we have zero one; this is the argument, and this is empty because there is nothing there. There are no more arguments to this function. This one has two, so instead of zero one we have one zero; this has a value, this has a value, and this is empty. And you get the idea. Again, this instruction and this instruction are exactly identical, and then this and this are exactly identical, and these don't take anything, so this is just empty. This is the binary representation. When we read this code and see it in ones and zeroes, it seems very big, but it's a binary representation, not strings: the things we read are bytes. So if we actually convert this into 16-bit unsigned integers, that's the representation of the thing before. With this approach, if you encode these instructions by the information they hold, you get a very compact form of representing the opcodes we saw before, and that's what Glimmer is trying to do in coming releases. So we can download the application in binary, and this has the guarantee that it is always going to be smaller than the HTML it's coming from. So by using a JavaScript framework, you are not actually making the application bigger than the raw HTML it's representing; you are actually making it smaller. Which is something that was never achieved, because we were never at this level of compression.
That was our good news; now I want to continue with the bad news. That is: I heard you like components.
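Going back to the binary encoding described above, here is a minimal editorial sketch of the idea. The layout is an assumption made for illustration: one 16-bit header word holding an opcode number and a two-bit argument count, followed by one 16-bit word per argument (a constant pool index). The opcode numbering and the `pack`/`unpack` helpers are hypothetical, not Glimmer's actual wire format.

```javascript
// Hypothetical opcode numbering; an instruction's position in this list is its number.
const OPCODE_NAMES = ["openElement", "staticAttr", "closeElement"];
const OPCODES = Object.fromEntries(OPCODE_NAMES.map((name, index) => [name, index]));

// Instructions for <div class="foo"><div></div></div>, with string operands
// already replaced by indexes into a constant pool ["div", "class", "foo"].
const encodedProgram = [
  ["openElement", 0],
  ["staticAttr", 1, 2],
  ["openElement", 0],
  ["closeElement"],
  ["closeElement"],
];

// Header word: opcode number shifted left by two, low two bits = argument count.
function pack(ops) {
  const words = [];
  for (const [name, ...args] of ops) {
    if (args.length > 3) throw new RangeError("two bits can only count up to three arguments");
    words.push((OPCODES[name] << 2) | args.length);
    words.push(...args);
  }
  return Uint16Array.from(words);
}

// Reverse of pack: read a header, then as many argument words as it announces.
function unpack(words) {
  const ops = [];
  let i = 0;
  while (i < words.length) {
    const header = words[i++];
    const argCount = header & 0b11;
    ops.push([OPCODE_NAMES[header >> 2], ...words.slice(i, i + argCount)]);
    i += argCount;
  }
  return ops;
}

const packed = pack(encodedProgram);
console.log(Array.from(packed)); // [ 1, 0, 6, 1, 2, 1, 0, 8, 8 ]
console.log(packed.byteLength);  // 18 (instructions only; the pool strings ship separately)
```

The two `openElement` instructions and the two `closeElement` instructions come out as identical words, which matches the observation in the talk that repeated instructions have the same binary form.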
Components are super nice. They are the building blocks of our applications, and they are super nice because they're usually super flexible. But if you remember, if something is super flexible, that also means you cannot optimise it, because Glimmer cannot know what is happening within a component's JavaScript file. Let's consider this example: you have an icon. The my-icon component takes icon=tomster, size=2, and it's going to generate this thing here. Just by looking at it we can understand how it works. I mean, it's so simple: you basically prefix tomster with fa-, and you put the size between fa- and an x. It's so simple that you think, okay, maybe Glimmer should be able to understand how this works and do this for us. There is no point in doing this at run time, because right now, and we are not a virtual machine, we know what the result is going to be. But if we look at the code, this is the code of this component. For us, it's not very complicated. But if Glimmer were able to understand everything that is going on in this file, we would be out of work, because Glimmer would be replacing humans. So we cannot optimise any of this, because Glimmer would have to be almost an artificial intelligence to understand JavaScript. It's too flexible.
Let's see if we can take the opposite approach. We understand that Glimmer is very good at templates. So what if we remove the logic from the JavaScript file and move all of it to the template? You may think, okay, this is something Glimmer is probably able to optimise. But no, Glimmer cannot optimise this either, for two reasons. The first one is that until the big module unification lands, I actually cannot be 100% sure what the JavaScript file and the template of my-icon are, because there is no static resolution that can tell me at compile time that the JavaScript file of this thing is going to be this one.
Once module unification lands, we will probably, well, we will surely be able to do this at compile time. The problem is that even if we do that, with this component here, in theory we can think: okay, it only has a tagName, it doesn't really have anything else, so it's probably safe to assume how it's going to work. But that's actually not true, because components can be reopened later. Or you can have injections inside initializers, or you can have this thing inheriting not from Component but from some other superclass that does have behaviour. So again, Glimmer cannot take chances and assume it's going to work in a certain way, and then have 1% of situations where it works in a different way. So the problem, again, is that once you cross into the realm of JavaScript, everything is possible. That's the worst thing for a compiler.
So I decided to change the approach, and think about whether there is something we can do that Glimmer cannot. We can make optimizations happen even if Glimmer doesn't have the information, because we can decide that we are not using the features that make a component non-optimizable. Based on this approach, I found that we can, and for that we just need to have less power. I set a few goals for this thing to be good enough to go public, and I came up with this list. It has to optimise as part of the regular workflow; I don't want people to run a special command to make it work. It has to generate the most optimal code possible given the information we have. It also has to resemble components, because we like familiarity. And ideally, it should be fast enough and enable other kinds of optimizations based on the analysis we do on the templates. For that we also need to give up a few things. We need to give up dynamic tag names, reopening components, injections (although injections could be supported eventually, not now), life cycle hooks, and some computed properties.
You can emulate computed properties, kind of. I found out that there is a tool in Ember already that enables this, and it's Ember Handlebars AST transforms. AST transforms allow us to understand the templates and see what you are doing, and at compile time, transform the templates to generate the end result when you know at compile time what the result is going to be. The bad news is that they are hard to use, and the worse news is that they are private API. So that's why I started thinking that we cannot use this directly, because it's private API, and I created this library, which I'm publicly releasing, or advertising, or promoting, right now: ember-ast-helpers. It's a collection of helpers and utilities to make AST transforms simple enough that you can do compile-time optimization on your own components, and not wait for Glimmer to be smart enough to do the optimization for you.
Let's go through the goals. Optimised as part of the regular compilation. I'll just go to the code; this is how it works. If you haven't seen this, it is a hook in the index.js of applications and addons where you can set up the preprocessors, and one of them is Ember AST plugins. So you import the transform, you require it from your lib folder, and that's it. If you do this, then whenever you save a file, whenever you install a dependency, whenever you clean tmp, whenever you restart your server, this is going to work as expected. Actually Ember itself is already using AST transforms for a few things, like transforming link-tos. So if you hadn't noticed, that's good.
The next thing is how the transformation is done. This is hopefully simple enough. Basically, you define your class, a transform that has a transform method that receives the AST, and you use the AST traverse method that comes from Glimmer. If you have used ESLint and created a rule, this is more or less the same thing.
You say: every time you see a mustache statement, replace that mustache statement with my thing. We'll get to how this works in a moment. So you save a file, it works; you remove a file, it works; you install a dependency, it works.
Generates the most optimised code. This is the fun part, and the thing I liked the most while developing the library. I wanted to see how optimised this could be. For my-icon, the same invocation we saw before, this is the most optimal equivalent HTML we can generate. So basically the library replaces this invocation with this HTML, so there is no component invocation at run time; you know what the HTML is going to be. If you have something more complex like this one, where you have spin=spin, you need to do something like this: it generates, if spin is true, then put the class fa-spin. However, if you don't have a bound value but a static value, it can just be smarter: if the value is true, you generate fa-spin and there is no if to execute. And this applies to anything else. Say you have a pull direction and you are interpolating the direction here. If you know the direction is left, you don't need to do the interpolation; you can know at compile time what the result is going to be. So it's smart enough to generate the most optimal code, and when it cannot know at compile time what the result is going to be, it can on a lot of occasions generate the equivalent if statement to produce the code you want.
And it can go even further: compile-time components can have templates. In this situation, this is code we write very often: if the component is flexible, you can invoke it with or without a block, and if it has a block, you yield, otherwise you do some default behaviour. In this case, if you invoke it this way, you know immediately that it is not going to have a block. There is no block here.
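As an editorial aside, the static-versus-bound distinction described above can be sketched with a toy model. Everything here is an assumption made for illustration: the node builders only borrow Handlebars-style type names, and `iconToTemplate`, `transform` and the `InlinedTemplate` node are hypothetical. This is not the ember-ast-helpers API and not the private Glimmer transform API.

```javascript
// Toy AST builders. The type names mirror Handlebars-style nodes, but this is a
// stand-alone model, not the real Glimmer syntax package.
const str = (value) => ({ type: "StringLiteral", value });
const num = (value) => ({ type: "NumberLiteral", value });
const bool = (value) => ({ type: "BooleanLiteral", value });
const path = (original) => ({ type: "PathExpression", original });
const mustache = (name, hash) => ({ type: "MustacheStatement", name, hash });

// A value is known at build time unless it is a bound path such as `spin`.
const isStatic = (node) => node.type !== "PathExpression";

// One class-name segment: a literal when static, an interpolation when bound.
function segment(node, prefix, suffix = "") {
  return isStatic(node)
    ? `${prefix}${node.value}${suffix}`
    : `${prefix}{{${node.original}}}${suffix}`;
}

// Hypothetical expansion of a my-icon invocation into plain template source.
function iconToTemplate({ hash }) {
  const classes = ["fa"];
  if (hash.icon) classes.push(segment(hash.icon, "fa-"));
  if (hash.size) classes.push(segment(hash.size, "fa-", "x"));
  if (hash.spin) {
    if (!isStatic(hash.spin)) {
      classes.push(`{{if ${hash.spin.original} "fa-spin"}}`); // decided at run time
    } else if (hash.spin.value) {
      classes.push("fa-spin"); // decided at build time, no `if` left to execute
    }
  }
  return `<i class="${classes.join(" ")}"></i>`;
}

// Visitor-style pass: every my-icon mustache is replaced, other nodes are kept.
function transform(ast) {
  return ast.map((node) =>
    node.type === "MustacheStatement" && node.name === "my-icon"
      ? { type: "InlinedTemplate", source: iconToTemplate(node) }
      : node
  );
}

const ast = [
  mustache("my-icon", { icon: str("tomster"), size: num(2) }),
  mustache("my-icon", { icon: str("tomster"), spin: path("spin") }),
  mustache("my-icon", { icon: str("tomster"), spin: bool(true) }),
];
const sources = transform(ast).map((node) => node.source);
console.log(sources.join("\n"));
```

With fully static arguments the invocation collapses to plain HTML; with a bound `spin` the output keeps a single inline `if`, which is the fallback the talk describes for values that cannot be known at compile time.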
So basically, you are also aliasing: person is actually user, and person.name basically compiles down to user.name. You can just remove the component, hoist the template, which is like inlining the template where the component is invoked, and replace the aliases you passed in, so person becomes user, and that's it. No component, and it is as optimal as it can possibly be given the information we have.
It resembles component syntax, because we like familiarity. I do like familiarity: when things fit my mental model, I feel more compelled to use them.
This also enables optimizations on other layers of the stack, because if we are compiling components statically, we are only one step away from analysing how we use the components and using that information for other purposes. In this situation, we are transforming all instances of fa-icon using this syntax: you take the my-icon node, you remove it, and you convert it into an element. This icon class, which is the class of the attribute, accepts a callback to do whatever you want, and in this case I am remembering all the icons I use. So if I use five icons, I remember that I use these icons and no others. And finally, I can take this information and pass it to a post-process, sorry, a PostCSS plugin that is going to strip the CSS you don't use from your build.
Does it work? Because this is quite a lot of complexity, and you are not using the thing you are most familiar with, since these are build-time components. And it does work pretty well. These are some numbers from ember-font-awesome version four, which is in alpha, and this is a comparison with the latest version. That one used regular components, and this one is completely compiled at build time. If we look at the CSS, assuming you use five icons, for example, this also becomes pay-as-you-go: if you use only five icons, you only include the CSS for five icons.
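As an editorial aside, the pay-as-you-go CSS idea described above can be sketched as two steps: remember which icon names were seen at build time, then keep only the matching rules. The `rememberIcon` callback, the rule objects and `stripUnusedIconRules` are hypothetical stand-ins; a real build would do the second step inside a PostCSS plugin, and the selectors and declarations below are placeholders.

```javascript
// Icon names seen with a static value while templates were being transformed.
const usedIcons = new Set();

// Hypothetical callback the build-time transform would call for each static icon.
function rememberIcon(name) {
  usedIcons.add(name);
}

// Toy stylesheet; selectors and declarations are placeholders.
const rules = [
  { selector: ".fa", body: "display: inline-block" },
  { selector: ".fa-tomster:before", body: 'content: "A"' },
  { selector: ".fa-zoey:before", body: 'content: "B"' },
  { selector: ".fa-unused:before", body: 'content: "C"' },
];

// Keep shared rules, and keep a per-icon rule only if that icon was remembered.
function stripUnusedIconRules(allRules, used) {
  return allRules.filter(({ selector }) => {
    const match = /^\.fa-([a-z0-9-]+):before$/.exec(selector);
    return match === null || used.has(match[1]);
  });
}

["tomster", "zoey", "tomster"].forEach(rememberIcon);
const kept = stripUnusedIconRules(rules, usedIcons);
console.log(kept.map((rule) => rule.selector));
// [ '.fa', '.fa-tomster:before', '.fa-zoey:before' ]
```

This only works for icon names that are static in the template; an icon name bound to a run-time value cannot be collected this way.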
So before, we were including around 30 kilobytes of CSS, minified. And after, you are including something like a tenth of that, that is 600 times less. Sorry, 600% less. And finally memory, which is something we don't usually check; I don't usually check it. If we render the whole library of icons at once, and this is a bit contrived, for example, the application takes 27 megabytes of RAM. This is an empty application with only a lot of icons. And now it takes 19. So we may think this is 40% less memory. But an application takes memory whether you use icons or not, so I had to measure the baseline, and an empty application takes 17 megabytes. That means that, with this calculation, and assuming 17 megabytes is the minimum you can possibly take, because it's an empty application, this is an increase of two megabytes versus an increase of nine megabytes, so this is really five times less, or in other words, 17% of the RAM you usually use. So these numbers are pretty big. The optimizations are amazing.
The bottom line of what I'm trying to explain is that I want this library to enable people to do the optimizations that Glimmer cannot do yet. I say cannot do yet because I hope this library is similar to spriting. Spriting, joining icons into a single image to serve only one image in only one request, was a very good optimization in HTTP/1, but it seems to be not so good, or even harmful sometimes, in HTTP/2. So improvements in the browser turned spriting from a good practice into actually a bad practice, and I hope that Glimmer improvements will render this library obsolete in the future.
Glimmer components are going to unlock many optimizations in Ember, and I want to explain why. This is an example component: my-component with title Hello World. And we see this is a string; this is never going to change. It's a string.
So having this template, we may naively think that the result is going to be this, but again, this is not true. This could change, because you have some JavaScript file, or this is inheriting from some other library, so you don't have the guarantee that nothing is messing with title. Maybe you have a service somewhere else looking up all components in the container and changing title. It's madness, I know, but it's possible. In Glimmer, if you pass @title Hello World and use @title in the template, you have the hard guarantee that attributes passed in are never modified, and if something starts with an at sign, you know that it is never, ever going to be mangled by anything. So Glimmer will be able to do these kinds of replacements. It's going to be able to say: oh, this component doesn't have a JavaScript file, it's only using @ signs. Remove this thing. Don't even ship this component. I know this is never actually going to run; I can always know the result at compile time. And Glimmer is very smart, so it can even look through your templates several levels deep. So if you have a component that passes title Hello World, and then message=@title, and then text=@message, and then @text, Glimmer can just remove everything and end up with only the minimum, the most optimal replacement. Again, this is a contrived example, but there is the death by a thousand paper cuts, and Glimmer could see through your code and optimise the small details you don't notice. Glimmer, obviously, is software, and it won't make mistakes; it will see through your code and do things for you.
So that's the bottom line. Soon there are going to be a lot of improvements thanks to the Glimmer virtual machine. I think these are exciting times in development, in Ember in particular, and other frameworks will follow this approach because it's going to yield big wins for everyone.
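As an editorial aside, the several-levels-deep replacement described above can be sketched as a small recursive inliner. The assumption that makes it valid is the one stated in the talk: arguments that start with an at sign are never modified after being passed in. The component names, the `components` table and the `inline` function are hypothetical; this is not how Glimmer represents templates.

```javascript
// Template-only components, modelled as either "invoke another component with
// these arguments" or "output this value". Names are hypothetical.
const components = {
  "my-component": { invoke: "my-message", args: { message: "@title" } },
  "my-message": { invoke: "my-text", args: { text: "@message" } },
  "my-text": { output: "@text" },
};

// Resolve a chain of invocations at build time. This is only safe because
// @-prefixed arguments are guaranteed never to change once passed in.
function inline(name, args) {
  const definition = components[name];
  const resolve = (value) => (value.startsWith("@") ? args[value.slice(1)] : value);
  if (definition.output !== undefined) {
    return resolve(definition.output);
  }
  const nextArgs = Object.fromEntries(
    Object.entries(definition.args).map(([key, value]) => [key, resolve(value)])
  );
  return inline(definition.invoke, nextArgs);
}

const result = inline("my-component", { title: "Hello World" });
console.log(result); // Hello World
```

Three component invocations collapse into one static string, so in this toy model none of the three components would need to exist at run time.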
Again, I hope you use my library to experiment with optimizations and maybe gain knowledge that can later be applied to the Glimmer virtual machine, so that this is done automatically by Glimmer. And that's everything I have.