Thursday, July 27, 2017

WKWebView for Clueless Mac Programmers

I have a big web app that I decided to try to package up as a Macintosh app. Unfortunately, I had zero knowledge about Mac programming. Like, I never owned a Mac. I didn't even know how to get the cursor to go to the start of a line or skip a word using the keyboard without having to look it up on Stack Overflow. I tried using Electron, but after spending a long time going through various documentation to rebrand and package things (the nw.js documentation is always such a joy to read compared to the Electron docs), I wasn't too satisfied with the result. It worked, but it was sort of clunky, and I think there was some weird sandbox thing going on that caused file reading to sometimes work and sometimes not. With Windows, it makes sense to use Electron because the Windows default browser engine has weird behaviour and not everyone has the latest version of Windows. But on the Mac, everyone gets free OS upgrades to the latest version and the browser engine is fairly decent, so there's no need to bundle a 100MB browser engine with an app. So I figured I could whip together a quick Mac application that's just a window with a web view in it in about the same amount of time it would take to debug the Electron sandbox issues.

<rant about Mac programming>
Programming for the Mac is just like using a Mac. Apple hides important details and tries to force you to do things their way. Apple keeps changing things underneath you, so all the documentation online or in books is always vaguely out of date. It's also expensive. I bought the cheapest Mac mini, with 4GB RAM and a hard disk, for development, thinking I could do mostly command-line stuff, but that's not the case. You really need to work from Xcode, and Xcode is a pig of a program that takes up a lot of RAM and is sort of slow. I almost immediately had to switch to using an external SSD on USB to get any reasonable responsiveness from my system.

Apple is really trying to stuff Swift down everyone's throats, but I opted to go with Objective-C because of my Smalltalk background. It's not bad, except the syntax is somewhat awful. My main issue is that part of what makes Smalltalk so productive is that it comes with an advanced IDE that's super fast and makes it easy to browse around large application frameworks to figure out how to use an API. Objective-C comes with an overwhelmingly huge application framework as well, but Xcode is slow and pokey and doesn't come with good facilities for diving through the framework. Code completion is not good enough. There should be a quick way to find how other people call the same method, to check out the documentation for a method, and to check out the inheritance tree. Xcode is more of a traditional IDE with some code completion. It would be nice if Xcode actually labelled all of its inscrutable icons too. No one knows what any of those buttons mean, but using those buttons isn't optional either.

The latest MacOS/OSX versions do include a web view, but I always get the feeling that Safari developers don't really understand web apps and want to discourage people from making them. I find that they only implement just enough of a feature to support their own uses and then lose all interest in implementing things in a general way that can have multiple uses. For example, for the longest time, they refused to implement the download attribute on links because Apple didn't need it, so why should anyone else need it? Then, when they did implement it, it initially didn't work on data URLs and blobs because they didn't understand how important that was for web apps. Similarly, the new WKWebView initially could only show files from the Internet and not load anything locally, making it useless for JavaScript downloadable apps. Then, even after they fixed that, things like web workers and XMLHttpRequest are still broken, really limiting its usefulness.
</rant about Mac programming>

Anyway, I found a great blog post that shows how to make a one-window app with a web view in it. It lists every step, so it's easy to follow along even with no understanding of Mac programming. It worked for me, but Xcode has changed its default app layout to use storyboards, so some of the instructions don't work any more, and it used the old WebView, which is very limited. The new WKWebView is better because it allows for JIT compilation of the JavaScript, and it comes with proper facilities for letting the JavaScript send data to native code (the old web view required a bad hack to do that; there's a sketch of the new mechanism at the end of this post). So here are some updated instructions:
  1. Get Xcode and start it up
  2. Create a new Xcode project
  3. Make a MacOS Cocoa Application
  4. Fill in the appropriate application info, choose Objective-C for the language
  5. That should bring you to the screen where you can adjust the project settings
    1. If you want to run in a sandbox, I think you have to turn on signing. I think Xcode will take care of getting the appropriate certificates for you (I had already gotten them earlier).
    2. At the bottom of the General settings, under "Linked Frameworks and Libraries", you should add the WebKit.framework
    3. In the Capabilities tab, you can turn on the App Sandbox if you want (I think this is needed for the Mac App Store). Be careful, there seems to be a UI bug there. Once you turn on the app sandbox, you can't turn it off from the UI any more.
    4. If you do enable the App Sandbox, you also need to enable "Outgoing Connections (Client)" in the Network category. This is required even if you don't use the network. WKWebView seemed to have problems loading local files if the network entitlement wasn't enabled.
  6. Go to your ViewController.h, and change it to
  7. #import <Cocoa/Cocoa.h>
    #import <WebKit/WebKit.h>
    
    @interface ViewController : NSViewController
    
    @property(strong,nonatomic) WKWebView* webView;
    
    @end
    
  8. When using storyboards, the app delegate doesn't have direct access to the view, so you have to control the view from the view controller instead.
  9. Then go to your ViewController.m. Usually, you would draw a web view in the view of the storyboard and then hook it up to the view controller. Although this is possible with the WKWebView, all the documentation I've seen suggests manually creating the WKWebView instead. I think this might be necessary to pass in all the configuration you want for the WKWebView. To manually create the WKWebView, add these methods that show a basic web page:
  10. - (void)loadView {
        [super loadView];
        // Create the web view to fill the window's content area
        _webView = [[WKWebView alloc] initWithFrame:
            [[self view] bounds]];
        // Let the web view resize along with the window
        [_webView setAutoresizingMask:
            NSViewWidthSizable | NSViewHeightSizable];
        [[self view] addSubview:_webView];
    }
    
    - (void)awakeFromNib {
        // Load a remote page first to confirm the web view works
        [_webView loadRequest:
            [NSURLRequest requestWithURL:
            [NSURL URLWithString:@"https://www.example.com"]]];
    }
    
  11. Now when you run the program, you should see the web page from example.com there.
  12. The next step is to create a directory with all your local web pages that you want to show. Create a folder named html in the Finder (i.e. just a normal folder somewhere outside of Xcode)
  13. Drag that folder onto your project in the file list. Enable "Destination: Copy if needed" and "Added folders: Create folder references"
  14. You should now have an html folder in your project. You can delete the original html folder that you created earlier in the Finder since you no longer need it. (You can confirm that the html folder will be included in your project properly by looking at your project file under the Build Phases tab; the html folder should be listed in the Copy Bundle Resources section.)
  15. Create an index.html file in your new html folder. Put some stuff in it.
  16. To show that page, go to your ViewController.m and change the awakeFromNib method to this:
  17. - (void)awakeFromNib {
        // Find the app bundle's Resources directory, where the html
        // folder gets copied during the build
        NSString *resourcePath = 
            [[NSBundle mainBundle] resourcePath];
        NSString *htmlPath = [resourcePath 
            stringByAppendingString:@"/html/index.html"];
        NSString *htmlDirPath = [resourcePath 
            stringByAppendingString:@"/html/"];
        // Grant read access to the whole html directory so that
        // index.html can load the other files beside it
        [_webView
            loadFileURL:[NSURL fileURLWithPath:htmlPath]
            allowingReadAccessToURL:
                [NSURL fileURLWithPath:htmlDirPath isDirectory:YES]];
    }
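
Incidentally, the "proper facilities" I mentioned for letting JavaScript send data to native code are script message handlers. Here's a rough sketch of the JavaScript side. It's hedged: it assumes the native code has registered a WKScriptMessageHandler under the name "app" using addScriptMessageHandler:name: on the configuration's userContentController, and the handler name and message format here are my own invention:

    // Inside a page loaded by the WKWebView. Each postMessage call is
    // delivered to the native handler's
    // userContentController:didReceiveScriptMessage: method.
    function sendToNative(message) {
        // Guard so the same page still works in a normal browser
        if (window.webkit && window.webkit.messageHandlers &&
                window.webkit.messageHandlers.app) {
            window.webkit.messageHandlers.app.postMessage(message);
        }
    }
    
    sendToNative({action: 'log', text: 'Hello from JavaScript'});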
    
    

Saturday, June 03, 2017

Nw.js vs. Electron

Today, I tried porting an html5 web app of mine into a desktop application. When it comes to running JavaScript programs on the desktop, there are two main choices: node-webkit (nw.js) and Electron. I wasn't sure which one to choose. I didn't think that my web app was very complicated, so I decided to use nw.js. It's simpler, older, has an easier programming model, and I've been happy with the nw.js-based apps I've used in the past.

Using nw.js was great. It was so simple and easy to use. I just unzipped nw.js somewhere, dropped my own web pages there, and off it went. It was nothing like the days and days of agony involved in making a Cordova app. The amount of documentation was very manageable, so I was soon diving through it to figure out various polishing issues. And it was all pretty simple. Fixing the taskbar icon was one line. Making it remember the window size from when it was last closed--also one line. Putting in a "save as" dialog was a little more work, but, again, nothing to sweat about.
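
To give a flavour of those one-liners, both were entries in nw.js's package.json manifest. This is a sketch from memory, so double-check the field names against the nw.js docs (the icon path is made up): "icon" sets the window/taskbar icon, and giving the window an "id" is what tells nw.js to save and restore the window's size and position.

    {
      "name": "my-app",
      "main": "index.html",
      "window": {
        "icon": "img/app-icon.png",
        "id": "main-window"
      }
    }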

Then, I decided that I wanted the save dialog to default to saving to Windows' My Documents folder. And that was hours and hours of agony. The nw.js API is pretty small, so I went through all the documentation with a fine-tooth comb, looking for how to do it, and I couldn't find anything. I then thought that maybe that API was in node.js, so I went through all the node.js documentation to find out how to do it--nothing. Then I thought there might be an NPM package to do it. After much searching, I turned up nothing. I think most people use node.js for server stuff, so they never need to store stuff in a user's Documents folder.

After hours of this, I took a peek at Electron, and it was right there. Electron has an extensive platform integration API for not only getting at the documents folder, but also for crazy MacOS dock stuff, etc. Electron is used by bigger companies that ship more complicated applications, so they care deeply about all the subtle platform integration issues needed for a polished app. As a result, Electron has a much deeper and more extensive platform integration API than nw.js. Of course, the Electron programming model is more complicated than nw.js's, so it seems like it will require a lot more code to get things going. And there's a lot more documentation, so I don't think it's possible to read it all, like I could with nw.js. And I'm concerned there might be annoying configuration issues. But it looks like I'll have to move to using Electron.

So if you need extensive platform integration APIs, use Electron, despite the fact that it's more complicated. If you're making something more self-contained, like, say, a game, then nw.js is probably fine, and you'll save time because it's so easy to set up.

Update (2017-6-7): Apparently, there's another difference in philosophy between nw.js and Electron too. nw.js tries to create a programming environment that imitates a normal web environment as much as possible. Platform integration is implemented as minor embellishments on existing web APIs, with reasonable defaults chosen. With Electron, using normal web APIs will work, but not well. Lots of platform integration features are available, but the programmer has to explicitly write separate Electron code to take advantage of those features, and the API isn't that nice (due to Electron's multi-process architecture and lots and lots of optional parameters). For example, to open a file dialog in nw.js, you can simply reuse the existing html5 file dialog APIs, and the return results are augmented with some extra path information that you can use to open files. To open a file dialog in Electron, you can't reuse your existing html5 file dialog code because Electron's implementation is missing a couple of features, so instead you have to use Electron's file dialog APIs. Electron's file dialog APIs are fine, but a little messy to set up, and by default they aren't modal, so you have to jump through some hoops to get normal file dialog behavior.
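
To make the file dialog difference concrete, here's a hedged sketch of both styles (from memory; the element id and comments are illustrative, and the Electron code shown is the 2017-era callback form of the dialog module):

    // nw.js: reuse the normal html5 file input. nw.js augments the
    // selected File objects with a .path property for the real file path.
    document.getElementById('openFile').addEventListener('change', function(evt) {
        var path = evt.target.files[0].path;
        // ...read the file with node's fs module...
    });
    
    // Electron: the html5 file input is missing features, so you use
    // Electron's own dialog module instead (not modal by default)
    const {dialog} = require('electron');
    dialog.showOpenDialog({properties: ['openFile']}, function(filePaths) {
        if (filePaths && filePaths.length > 0) {
            // ...read filePaths[0] with node's fs module...
        }
    });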

Friday, May 12, 2017

Building a Basic City Builder

This post is me rambling about trying to understand how city builder games work by making a very, very simple city simulation model.

I love city builder games, but I always get very frustrated playing them. I think the problem is that most game designers and programmers of city builder games don't actually obsess about cities and don't spend hours reading about and thinking about cities. Now, Will Wright, the designer of the first SimCity, did spend a lot of time reading about the philosophy of cities, and the original SimCity was a great simulation for a game that had to be able to run on a 4.77MHz computer. It was built around themes of how residents needed a balance of residential, commercial, and industrial zones and of how land values and transportation access were important. Later city builder games do include much more complex city models, but I find they lack any over-arching theme or philosophy about the nature of cities. How does New Urbanism, one of the biggest movements in urban planning over the last few decades, not have any of its tenets reflected in any of the most recent city builders? The designers of the latest city builder games simply focus too much on the game aspects of city builders and not enough on the urban planning. They seem to design game mechanics and simulation parameters based on what seems "fun" instead of reflecting upon a philosophy or theme of what constitutes a city.

Part of the joy of cities is that they are full of stories. Every neighbourhood has a story about how it evolved and grew and all the little things that people do there. Where do people shop? How do they get to work? What do they do for fun? Every city has a different story. But most city builder games have their simulation parameters set in such a way that they can only tell one story. For example, SimCity 4, which I consider to still be the pinnacle of the SimCity series, has its simulation set up in such a way that you almost inevitably end up with a city that looks like northern California. The simulated residents are highly biased in favour of driving, and you have little ability to influence that. The city simulation is resistant to the levels of densification typical of non-American cities. The simulation doesn't allow farms in built-up areas, but I encountered plenty of urban farms when I lived in Switzerland. Even a basic assumption of the game, like the fact that you need to supply water and sewage infrastructure to get any housing development at all, isn't actually true: Dubai was able to build many towering skyscrapers that weren't hooked up to a sewage system. All of these assumptions and fixed parameters in the city simulation constrain what sort of cities can be produced in the game and restrict the types of stories that players can tell. Even worse, SimCity 5 completely abandoned all pretense of accurately simulating a city and embraced purely game-based mechanics for modelling cities.

There's a lot of buzz about Cities: Skylines, which was made by a Finnish developer that previously made some awful transportation simulation games. Their transportation simulator never worked well for trains and buses, and it still doesn't, but they did get it to work well enough for cars that they were able to make a financially successful city simulator. Similar to how the developers built many transportation games that focused on modelling minute details of bus scheduling and bus stops while completely missing the big picture understanding of how mass transit actually works, Cities: Skylines has a detailed city model underneath that provides a simulacrum of a city when viewed at scale, but has no meaningful philosophy in its design and doesn't make much sense when you poke into it. One of the major aspects of the simulation models the player's ability to move dead bodies through the city! I'm currently living in Toronto, and I can't help but think that Jane Jacobs would cry if she knew how many YouTube videos there were of people building multi-block, multi-story so-called "optimal" intersections in Cities: Skylines. The sheer prevalence of these videos is a sign that the underlying simulation model and theme of the game are broken. Note to armchair city builders: if you're building a continuous flow intersection in your city, you've already failed.

Of course, it's easy to complain about things. It turns out that I don't really know how to make a better system. Although I think I figured out how to make reasonable transportation models many years ago, I've never figured out how the underlying economic models of city simulators should work. In fact, I'm not entirely sure how the economic models of existing city simulators are designed. As such, it's hard to know what their underlying assumptions are, how they might be wrong, and how they might be fixed. The economic models of games are obviously biased in favour of growth. If a player lays out tracts and tracts of residential zones in the middle of nowhere, people will suddenly build houses in those zones for no apparent reason. Admittedly, in many places in the world, this is a reasonable assumption. In the areas near large, booming, metropolitan centres, if the government were to spend millions to build out sewage, power, and highway infrastructure to an area and then zone it for a subdivision, developers would quickly build tracts and tracts of suburban housing there. And for gameplay purposes, it's important for the city simulation to be biased towards growth because players love the feel of an expanding city where bigger and better things are constantly being built (though playing a dying city where the infrastructure must be slowly rolled back as people move out and where its role has to be reinvented might make an interesting scenario). But is this biasing towards growth done in a heavy-handed way that restricts the ways that a city can evolve or in a subtle way that still lets players design a city the way that they want?

To get better insight into the way these economic models might work, I dabbled a bit in reading academic papers on urban planning models, but I never could figure them out. So I fell back on a trick I figured out in high school and looked for the oldest paper on the subject I could find, and I actually found one that was somewhat comprehensible: Kenneth Train's "A Validation Test of a Disaggregate Mode Choice Model." My takeaway from the paper is that real-world urban planning models are based on polling a population and building statistical models of how that population weighs the choices it makes about where to live and how to get around. For someone building a computer game, a micro-economic agent simulation should capture this. Basically, you have a statistical distribution saying that for every 100 people, 30 prefer a house with a yard, 20 choose their home based on the quality of the schools, 35 need to live within 10 minutes of work, and 15 like having a lot of cultural amenities. Then during the game, you randomly generate people based on the statistical distribution, throw them into the city, and have them make individual choices based on their preferences. You then just have to choose an appropriate statistical model of people to get the biases you want for your game. In hindsight, this is pretty obvious: if you model a bunch of individual, different people, then in aggregate you will get an accurate city model. This still left a big problem though. An agent simulation like this will accurately model the residents in a city, with all of its assumptions explicitly encoded, but it doesn't really work for modelling a city's growth. Why do people move to a city? How do you bootstrap an initial population for a city that has no buildings, no residents, and no infrastructure? If a game just regularly generates random people based on a statistical distribution and throws them into the city, then the whole simulation is inherently biased towards growth again. It seems like too blunt an approach to the problem. Surely, there must be a more nuanced way of modelling growth with a better philosophy behind it than the theme of unlimited growth? Is there a way of modelling growth that provides more adjustment knobs that can be used to encode different assumptions about growth?
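
As a sketch of the idea (the preference names and numbers are just the illustrative ones from above, and the code is mine, not from the paper):

    // Statistical distribution of preferences, per 100 people
    var PREFERENCES = [
        {weight: 30, type: 'houseWithYard'},
        {weight: 20, type: 'goodSchools'},
        {weight: 35, type: 'within10MinOfWork'},
        {weight: 15, type: 'culturalAmenities'}
    ];
    
    // Randomly generate a new resident according to the distribution
    function generatePerson() {
        var total = PREFERENCES.reduce(
            function(sum, p) { return sum + p.weight; }, 0);
        var roll = Math.random() * total;
        for (var i = 0; i < PREFERENCES.length; i++) {
            roll -= PREFERENCES[i].weight;
            if (roll < 0) return {preference: PREFERENCES[i].type};
        }
        return {preference: PREFERENCES[PREFERENCES.length - 1].type};
    }
    
    // Each generated person then scores candidate homes using their own
    // preference, and in aggregate the city reflects the distribution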

I thought about this growth problem for a few years, but I could never make any headway with it. Regularly generating new random people who want to move to a city might work for a competitive, multiplayer version of SimCity: if there are different cities with different amenities, then depending on how your city compares to the others, newly generated people might choose to move to other cities instead of yours. But I couldn't figure out how a growth model would work for a single-player city building game. I decided that the only way to find a reasonable approach to this problem would be to actually build a small game where I could dig into the details of the problem. Hopefully, after being enmeshed in the details, I would be able to see something that I couldn't see from far away. I came up with a design for a small city simulator that would focus on the economic model (since I felt I already understood how to design the transportation model), and then it was just a matter of finding the time to build it.

Finally, last week on Thursday, I received a last-minute e-mail saying a spot had opened up at the TOJam game jam running that weekend, so I decided it was time to dive in. I had worked out a design for a simplified city builder earlier. The city builder would present the side view of a single street. Since the focus was on the economic model and streetscaping and not on transportation issues, there was no need for a full 2d city. Having a side view also meant that the game could have a simplified interface that might even work ok on cellphones. In the game, players would place individual buildings and not zones. I think most city builder players like to customize the looks of their cities, but placing individual buildings doesn't work well at a large scale. On the small scale of a side-view game, though, I was hoping that placing individual buildings would be feasible. During the first day, I was able to finish coding up a basic UI that would let players plop buildings on the ground and query them. There was a floating artist at TOJam, Rob Lopatto, who drew some amazing pixel art for me of a house and two types of stores.



On the second day, I coded up a basic traffic model. Since I was just trying to make something as simple as possible in a limited time, I only modelled people walking between buildings at a fixed speed. Similar to SimCity 1-4, I modelled the aggregate effect of people walking around instead of modelling the specific, individual movements of those people on the road. I think the lesson of SimCity 5 and Cities: Skylines is that modelling the movement of individual cars can be slow and leads to strange anomalies, especially when there is extreme traffic. In real life, during extreme traffic, people shift their schedules to travel during non-peak times, or they change routes, or they move. It is rare for a traffic situation to become so dire that people end up in multi-day traffic jams and never reach their destinations. The problem with modelling the aggregate effect of traffic is that the simulation simply outputs some traffic numbers for chunks of road. There's nothing to see, and players like seeing little people and cars moving around. So I had to code up a separate traffic visualization layer that would show people moving around in proportion to the amount of traffic there is. I wasn't sure if I would end up showing the right traffic if I generated people doing whole trips (my queuing theory is really bad), so instead I used the SimCity 4 trick of randomly generating people to walk around for short sections of road and then have them disappear again. I could then just periodically generate new people on sections of road that weren't showing enough traffic over time in their visualization. Surprisingly, even though my simulation was small enough that I could simulate the whole world 60 times a second, I still ended up using my geometric series approach in both the traffic visualization and parts of the simulation. It worked really well!
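
Roughly, the visualization trick works like this (a sketch with invented data structures; the decay constant and walking speed are arbitrary):

    var WALK_SPEED = 0.02;
    
    function updateTrafficVisualization(roadSections, walkers) {
        roadSections.forEach(function(section) {
            // Decay the record of recently shown traffic over time
            section.shownTraffic *= 0.99;
            // Spawn a short-lived walker on sections that are showing
            // less traffic than the simulation says they should
            if (section.shownTraffic < section.simulatedTraffic) {
                walkers.push({section: section, pos: 0});
                section.shownTraffic += 1;
            }
        });
        // Move walkers along their section; at the end of the section
        // they simply disappear again, SimCity 4 style
        for (var i = walkers.length - 1; i >= 0; i--) {
            walkers[i].pos += WALK_SPEED;
            if (walkers[i].pos > walkers[i].section.length)
                walkers.splice(i, 1);
        }
    }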

By the end of the second day, though, I had hit a wall. I still couldn't figure out how to model city growth. I could simulate people in the city, but I couldn't figure out how to get new people to move in. I didn't want to explicitly encode a rule for having people automatically move into the city. Perhaps I could come up with some sort of hacky rule for when new residents would be induced into moving into the city. The new rule would likely still have the emergent behaviour of implicitly biasing the city towards growth, but if the rule made thematic sense, then it would be more satisfying and could be tweaked and improved later on. I started leaning towards the idea of using jobs to induce people to move to the city. If there were companies looking to hire in the city, then people would move there. That mostly makes sense, and it avoids explicitly biasing the city simulation towards growth.

I still had a bootstrapping problem though. The companies in a city won't hire people unless they have customers and are making money. But if there's no one living in a city, then companies will have no customers and hence have no jobs. I could make companies hire people even when they have no customers, or I could maybe implement a hack where companies might tentatively hire people to see if they can make money and then fire them if it doesn't work out. I think games like SimCity and Cities: Skylines have a hack where cities with small populations have an explicit macroeconomic boost to industrial jobs. If you zone some industrial areas, some companies will move in and create some factories to employ people even if they have no customers and no one lives in the city. This seemed like just another artificial bias towards growth, even if it was in a different form, so I wanted something different.

Instead, I went with a different cheat: I created a type of building that was self-sufficient and could supply an initial boost of employment without depending on the existence of other people or infrastructure. I opted for subsistence farming plots. They could provide a minimal income even in a city with no other people or infrastructure, thereby attracting an initial population base. 100-150 years ago, the Americas were settled by offering free plots of farming land to immigrants, so it's not entirely unheard of, though I'm not sure how realistic that assumption would be now. Once the simulated city developed a sufficient population, there would be enough collective demand to make stores or small workshops profitable, so they would employ people, resulting in a positive feedback loop of growth. This ends up supporting a theme that a city is dependent on investments in infrastructure to support certain types of economic activity and growth. Or maybe it says that people power a city, but infrastructure is needed to improve efficiency and productivity to unlock that power. In any case, I think those are reasonable philosophies around which a city simulation can be designed. I'd be a little bit afraid of making the rules too deterministic so that it feels more like a game than a story-generating city simulator (e.g. you need electricity to have a factory over size 3, or you need an outside road link to let your industrial population grow past 3000, or stuff like that). And there's another danger of inadvertently building an arbitrary civilization simulator instead (e.g. you need iron mines, coal mines, and an iron smelter to build the steam engine, which is then a prerequisite to industrial age buildings, etc.). But it does show that this philosophical approach is broad enough to capture many different city models.

On the third and final day, I polished up the UI and tweaked the city model a bit. Since there were only three different building types, and given the shortness of time, the final city model was still very simple, but it seemed to work well enough, and it helped me work out a different way to simulate growth in a city builder game. Here's an overview of the rules in the final simulation code (with a sketch of how they fit together after the list):
  1. People without a job will go to work at the building with the greatest demand for workers (demand must be at least one full worker). Farms always need one worker while stores need workers proportional to the number of visitors/customers they have
  2. People without a home will move to any home they can find
  3. People will move to a home that's closer to their work than their current home
  4. People will visit the closest store to their home to buy things, but if the number of visitors exceeds the store's capacity, the people will move to the next closest store (and so on)
  5. People without homes or jobs will leave the city
  6. People will cause 1 unit of traffic for each square of the street they need to walk on to get from their home to their work
  7. Any building that still has a demand for workers that can't be filled from the local population will hire someone from outside the city (provided that person can find housing)
  8. Every 100 turns, rent will be collected from each building's residents and workers. Upkeep costs for each building will also be deducted.
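A sketch of how some of those rules fit together in a single tick (the building and person structures are invented for illustration, and rules 4, 7, and 8 are omitted to keep it short):

    // buildings: {type: 'farm'|'store'|'house', pos, workers, visitors,
    //             residents, capacity}; people: {home, job}
    function workerDemand(b) {
        if (b.type === 'farm') return 1 - b.workers;
        if (b.type === 'store') return Math.floor(b.visitors / 10) - b.workers;
        return 0;  // houses don't employ anyone
    }
    
    function distance(a, b) { return Math.abs(a.pos - b.pos); }
    
    function simulationTick(city) {
        city.traffic = {};
        for (var i = city.people.length - 1; i >= 0; i--) {
            var person = city.people[i];
            // Rule 1: jobless people take the job with the greatest demand
            if (!person.job) {
                var best = null;
                city.buildings.forEach(function(b) {
                    if (workerDemand(b) >= 1 &&
                            (!best || workerDemand(b) > workerDemand(best)))
                        best = b;
                });
                if (best) { person.job = best; best.workers++; }
            }
            // Rules 2 and 3: take any free home, preferring one that's
            // closer to work than the current one
            city.buildings.forEach(function(b) {
                if (b.type !== 'house' || b.residents >= b.capacity) return;
                if (!person.home || (person.job &&
                        distance(b, person.job) < distance(person.home, person.job))) {
                    if (person.home) person.home.residents--;
                    person.home = b;
                    b.residents++;
                }
            });
            // Rule 5: people with no home and no job leave the city
            if (!person.home && !person.job) city.people.splice(i, 1);
            // Rule 6: 1 unit of traffic per street square between home and work
            if (person.home && person.job) {
                var lo = Math.min(person.home.pos, person.job.pos);
                var hi = Math.max(person.home.pos, person.job.pos);
                for (var x = lo; x < hi; x++)
                    city.traffic[x] = (city.traffic[x] || 0) + 1;
            }
        }
        city.turn++;
    }
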
Here's the final game.

Tuesday, January 24, 2017

Trying Out Some Emscripten on Chrome

Omber is the GWT JavaScript app that I'm currently working on. It runs in a browser, and I've also created an Android version using Cordova. It has some computationally-intensive routines, so it's sometimes a little sluggish on cellphones, which is understandable given how under-powered cellphones are. I've been looking at whether there are ways to improve its performance.

The cellphone version of Omber runs on Chrome (specifically, the Crosswalk version of Chrome). It's unclear how to get optimal performance out of Chrome's V8 JavaScript engine. The Chrome developers talk a lot about how great its Turbofan optimizer is, but they never actually give any advice on how to write your code to get the best code generation from Turbofan. My code does a lot of floating point math, and I really need the numbers to be packed tightly to get the best performance out of the system. Should I be manually using Float64Arrays to do this? Or is V8's Turbofan smart enough to put them directly into objects? Are there ways I can add type hints to arrays and other methods? Can I reduce the number of array bounds checks? In a language like C++, I could simply write my code in a way that would produce the code generation that I wanted, but how do I guide Chrome into generating the code that I want?
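
For example, here's the kind of rewrite I'm left wondering about (a generic sketch, not advice from the V8 team): packing coordinates into a flat Float64Array instead of an array of point objects guarantees tightly packed doubles, at some cost in readability.

    // An array of point objects: whether x and y end up stored as
    // packed, unboxed doubles is up to the optimizer
    var points = [{x: 0.5, y: 1.5}, {x: 2.5, y: 3.5}];
    
    // A flat typed array: the doubles are guaranteed to be packed,
    // with no per-point object overhead
    var coords = new Float64Array([0.5, 1.5, 2.5, 3.5]);
    
    // Points are stored as (x, y) pairs at indices 2i and 2i+1
    function totalLength(coords) {
        var total = 0;
        for (var i = 2; i < coords.length; i += 2) {
            var dx = coords[i] - coords[i - 2];
            var dy = coords[i + 1] - coords[i - 1];
            total += Math.sqrt(dx * dx + dy * dy);
        }
        return total;
    }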

Mozilla has their Emscripten project that can compile C++ to asm.js-style JavaScript code. Firefox then has a special optimizer for translating JavaScript written in the asm.js style into highly optimized machine code. Personally, I think asm.js isn't a great idea. The asm.js subset is very limiting and sort of hackish. As far as I can tell, the code it produces is not very portable either. Basic things like memory alignment and endianness are ignored or simply handled poorly. For these reasons, most of the other browsers don't support asm.js-specific code optimization, but they claim that their optimizers are so good that their general optimization routines will still get good performance out of asm.js code.

So is it worth using Emscripten or not? To try things out, I made a small test where I took my polygon simplification code, rewrote it in C++, compiled it to JavaScript using Emscripten, and compared the performance to my original GWT code. I was too lazy to record the actual numbers I was getting during my benchmarking runs, but here are the approximate numbers:

Original code on Chrome: ~280ms
Emscripten code on Chrome: ~230ms
Emscripten code with -O2 on Chrome: ~300ms
Original code on Firefox: ~4000ms
Emscripten code on Firefox: ~160ms
C++ code: ~150ms

Takeaways:

  • The Firefox code optimizer isn't very good, so having a special optimizer for asm.js is really useful for Firefox. With asm.js code, though, Firefox was able to get performance pretty close to that of raw C++ code.
  • The Chrome optimizer is so good that the performance of the normal JavaScript code is almost as good as the Emscripten code. In fact, it probably wasn't worthwhile rewriting everything in C++ because I could probably have gotten similar performance by optimizing my Java(Script) code more.
  • Since the Chrome optimizer isn't specifically tuned for Emscripten code, the Emscripten code might actually result in worse performance than JavaScript, depending on whether Turbofan is triggered properly or not. For example, compiling Emscripten code with more optimizations (i.e. -O2) actually resulted in worse performance from Chrome.
I was a little worried that Chrome's V8 engine might be tuned differently on cellphones, meaning that I might not get similar performance numbers when running on a cellphone. So I also ran the benchmarks on Cordova:

Original code on Chrome: ~2600ms
Emscripten code on Chrome: ~1600ms
Emscripten code with -O2 on Chrome: ~2800ms

Here, we can see that the Turbofan optimizer is still triggered even on cellphones, and the resulting code performs much better than the original JavaScript code. The Turbofan optimizer still isn't reliable though, so you might actually get worse performance depending on the Emscripten code output.

I'll probably stick with the Emscripten version for now, but I'll later try to optimize my original JavaScript and see if I can get similar performance out of it. It would be nice if I could just link my C++ code directly with JavaScript, but Cordova doesn't allow this. In Cordova, all non-JavaScript code must be triggered asynchronously through messages, which isn't a good fit for my application. It might be possible to do something with Crosswalk, but it seems messy and I'm too lazy. 

Alternatively, I could try using Firefox on cellphones since its optimizer can get performance that's near that of C++, but the embedding story is a little unclear. The Mozilla people abandoned support for embedding their Gecko browser engine and ceded that market entirely to Chrome/Blink. They've now realized that it was a mistake, and they're trying to get back in the game with their Positron project, etc., but I think they've entirely missed the point. They're building an embedding API that's compatible with Chrome's CEF, but Chrome's CEF already works fine, so why would anyone want to use Mozilla's version? The space to play in is the mobile market. Instead of wasting time on FirefoxOS, they should have spent more time working on embedded Firefox for mobile apps. An embedded Firefox for iOS with a JavaScript precompiler would be really useful, and Mozilla could dominate that space. Well, whatever.

Friday, September 23, 2016

It's Impossible to Write Correct JavaScript Programs

During a coding session involving JavaScript, UIs, and new asynchronous APIs, I realized that it's no longer possible to write correct programs with user interfaces. The JavaScript language has long had a problem with language designers who each tinker on their own isolated part of the system without seeing how things fit together as a whole. Though their individual contributions may be fine, when everything gets put together, it's a mess that just doesn't work right. Then, later generations of designers patch things over with hacks and fixes to "smooth things over" that just make the language more and more convoluted and complicated.

Right now, the language designers of JavaScript are all proud of their asynchronous JavaScript initiatives like promises, async, and whatnot. These "features" shouldn't be necessary at all. They are "fixes" to bad decisions that were made years earlier. It's clear that the designers of these asynchronous APIs mainly do server-side work instead of front-end work because asynchronous APIs make it next to impossible to write correct user interfaces.

In all modern UI frameworks, the UI is single-threaded and synchronous. This is necessary because UI code is actually the trickiest and hardest code to write correctly. People who write back-end code or middleware code or computation code actually have it easy. Their code can rely on well-defined interfaces and expected protocols to ensure the correctness of their algorithms. As long as your code calls the libraries correctly and follows proper sequences, then everything should work fine. By contrast, UI code interfaces with the user. The user is messy and unpredictable. The user will randomly click on things when they aren't supposed to. You might think, how hard can it be to write some code for clicking on a button? But what happens when they start clicking on different buttons with their mouse and finger at the same time? What happens when they start dragging the scrollbar in one direction while pressing the arrow keys in the opposite direction at the same time? Users will drag things with the mouse while spinning the mousewheel and then press ctrl-v on the keyboard and get angry when the UI code becomes confused and formats their hard drive. Then when you fix that problem, some other user will get angry because they were using that combination as a quick shortcut for formatting hard drives and want the old behavior back. Reasoning about the correctness of UI code is very hard, and the only thing that makes it tractable at all is that it's all synchronous. There is one event queue. You take an event from the queue and process it to completion. When you take the next event off the event queue, you don't know what it is, but it will be dependent on the new state of the UI, not the old one. You don't know what crazy thing the user is going to do next, but at least you know the state of the UI whenever the next event occurs.

Asynchronous JavaScript inconveniently breaks the model. All of these asynchronous APIs and promises are based on the idea that you start an action in another thread and then execute some sort of callback when the execution is complete. This is fine for non-UI code because you can use modularity to limit the scope of how crazily the state of the program will change from when you invoke the API and when the callback is called. It's even fine if these sorts of APIs are needed occasionally in UIs. During the rare time that an asynchronous XMLHttpRequest is needed, I can spend the day mapping out all the mischief that the user might do during that pause and writing code to deal with it when the request returns. But these asynchronous APIs are now becoming so widespread that I'm just not smart enough to be able to work out these details any more. The user clicks a button, you call an asynchronous API, then the user navigates to a different view, then the asynchronous call comes back to show its result, but all the UI elements are different now. The textbox where you wanted to show the result is no longer there. So now in your promises code, you have to write all sorts of checks to validate that the old UI actually still exists before displaying anything. But maybe the user clicked on the button twice, so the old UI still exists, but the 2nd asynchronous call returned before the 1st one, so now you need to write some custom sequencing code to make sure things are dispatched in the proper order. It's just a huge unruly mess.
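
To give a feel for how unruly it gets, here's a sketch of the defensive code needed just to guard against stale and out-of-order responses (the URL and element id are placeholders):

    var latestRequestId = 0;
    
    function onSearchClicked(query) {
        // Tag every request so stale responses can be recognized
        var requestId = ++latestRequestId;
        fetch('/search?q=' + encodeURIComponent(query))
            .then(function(response) { return response.json(); })
            .then(function(results) {
                // A newer request was issued in the meantime; this
                // response is stale, so drop it to keep ordering sane
                if (requestId !== latestRequestId) return;
                // The user may have navigated to a different view, so
                // check that the textbox we wanted still exists
                var resultsBox = document.getElementById('results');
                if (!resultsBox) return;
                resultsBox.textContent = JSON.stringify(results);
            });
    }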

The only practical solution I can find is to suppress the user interface during asynchronous calls, so that the user can't go crazy on the user interface while you're doing your work. This is a little dangerous because if you make a mistake, you might accidentally forget to unsuppress the user interface during some strange corner cases, but dealing with these corner cases is a lot easier than dealing with the corner case of the user generating random UI events while you're waiting on an asynchronous call. There was one proposal to add an "inert" attribute to html to disable all events, but that was eventually killed. Right now, the only hope for UI coders is to misuse the <dialog> tag, but very few browsers support it currently.
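
In practice, the suppression ends up being something like a full-window overlay (a sketch; note that a plain overlay only swallows pointer events, so keyboard events still need separate handling):

    function suppressUI() {
        // Invisible full-window overlay that swallows pointer events
        var blocker = document.createElement('div');
        blocker.id = 'ui-blocker';
        blocker.style.cssText =
            'position:fixed; top:0; left:0; right:0; bottom:0; z-index:9999;';
        document.body.appendChild(blocker);
    }
    
    function unsuppressUI() {
        var blocker = document.getElementById('ui-blocker');
        if (blocker) blocker.parentNode.removeChild(blocker);
    }
    
    function onSaveClicked() {
        suppressUI();
        // saveToServer is a placeholder for any promise-returning
        // asynchronous API. Every path, including errors, must
        // remember to unsuppress the UI -- the corner cases live here.
        saveToServer().then(unsuppressUI, unsuppressUI);
    }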

The annoying thing is that these things are just sad hacks that make programming more and more convoluted. Despite all the pride that the JavaScript designers have in their clever asynchronous promises API, that too is just a hack to paper over previous questionable decisions. The root cause of all these issues is the arbitrary decision that was made many years ago that there would be no multithreading in JavaScript. As a result, the only way to run something in parallel is to use the shared-nothing Web Worker system to run things in, essentially, separate processes. Although the language designers proudly proclaimed that there would be no concurrency errors because the system didn't allow shared objects or concurrency mechanisms, this system ended up being so limited that no one really used it. There were no concurrency errors in JavaScript programs because no one used any concurrency. (Language designers are now trying to "fix" Web Workers by creating a convoluted API that adds back in shared memory and concurrency primitives, but only for JavaScript code that is translated from C++.) Once JavaScript multithreading was killed, a certain old dinosaur of a browser company (no, not Microsoft, I meant dinosaur literally) discovered that their single-threaded browser kept hanging. Although every other browser maker moved to multi-process architectures that ensured the browser remained responsive regardless of the behavior of individual web pages, this single-threaded browser would become unresponsive if any tab made a long-running synchronous call. Somehow, the solution to this problem was to remove all synchronous APIs from JavaScript. And now we can't write correct UI code in JavaScript any more.

JavaScript is getting to be a big mess again. The fact that it's no longer possible to write correct user interface code any more is a clear signal that something has gone wrong. The big browser vendors need to call in some legendary language gurus to rethink the language and redirect it down a more sane path. Perhaps they need to call in some academics to do some original research work on possible better concurrency models. This has actually happened in the past, when Guy Steele was brought in for the original JavaScript standardization or when Douglas Crockford killed ES4. It looks like something like that is needed again.

Sunday, January 10, 2016

Java Metaprogramming Is Widespread But Slowly Dying

Metaprogramming is one of those cool academic topics that people always talk about but never seem all that practical or relevant to real-life programming. Sure, the idea of being able to reprogram your programming language sounds really cool, but how often do you need to do it? Is it really that useful to be able to change the behavior of your programming language? How often does a programmer need to do something like that? Shouldn't you be able to do everything in the programming language itself? It seems a lot like programming in Haskell--technically cool, but totally impractical.

I've recently started realizing that metaprogramming features in programming languages aren't important for technical reasons. Metaprogramming is important for social reasons. Metaprogramming is useful because it can extend the life of a programming language. Even if language designers stop maintaining a programming language and stop updating it with new features, metaprogramming allows other programmers to evolve it instead. Basically, metaprogramming wrests some control of a programming language away from its main stewards and hands it to outside programmers.

One of the best examples of this is Java. Traditionally, Java isn't really considered to have good metaprogramming facilities, but it has some pretty powerful components:
  • It has a reflection API for querying objects at runtime. 
  • It has a nice java.lang.reflect.Proxy class for creating new objects at runtime. 
  • By abusing the classloading system, you can inspect the code of classes and create new classes. 
  • The JVM instruction set is well-documented and fairly static, making it feasible for programs to generate new methods with new behavior. 
The main missing pieces are
  • The instruction set is so big and complicated that it's cumbersome to analyze code or to generate new methods
  • You can't really override any of the JVM's behaviors or object behaviors
  • You can't really inspect or manipulate the running code of live objects
The crowning piece of the Java metaprogramming system, though, is annotations. To be honest, most of the real metaprogramming stuff is too complicated to figure out. Annotations, by contrast, are simple. An annotation is just a small bit of user-specified metadata that can be added to classes and methods. Its simplicity is what makes it so powerful. It's so simple to understand that many programmers have used annotations to trigger all sorts of new behaviors in Java. Annotations have been used and abused so much that their use is now widespread throughout the Java ecosystem. This type of metaprogramming is probably the most used metaprogramming facility in programming languages right now.

I believe that metaprogramming through annotations has allowed Java to evolve and to add new features despite long periods of inactivity from its stewards. For example, during the 10 years between Java 5 and Java 8, there weren't any major new features added to the Java language. While Java was stagnating during that period, other languages like C# or Scala were evolving by leaps and bounds. Despite this, Java was still considered competitive with others in terms of productivity. One of the reasons for this is that Java's metaprogramming facilities allowed library developers to add new features to Java without having to wait for Java's stewards. Java gained many powerful new software engineering capabilities during those 10 years that put it on the leading edge of many new software practices at the time. Metaprogramming was used to add database integration, query support, better testing, mocking, output templates, and dependency injection, among others, to Java. Metaprogramming saved Java. It allowed Java to be used in ways that its original language designers didn't anticipate. It allowed Java to evolve and stay relevant when its language stewards didn't have the resources to push it forward.

What I find worrisome, though, is that the latest language developments in Java are weakening its metaprogramming facilities. Java 8 weakened metaprogramming by not providing any reflection capabilities for lambdas. Lambdas are completely opaque to programs. They cannot be inspected or modified at runtime. From a functional/object-oriented cleanliness perspective, this is "correct." If an object/function exports the right interface, it shouldn't matter what's inside of it. But from a metaprogramming perspective, this causes problems because any metaprogramming code will be blind to entire sections of the runtime. Java 9 will further weaken metaprogramming by imposing extra visibility restrictions on modules. Unlike previous versions of Java, these visibility restrictions cannot be overridden at runtime by code with elevated security privileges. From a cleanliness perspective, this is "correct." For modules to work and be clean, normal code should never be able to override visibility restrictions. The problem is that the lack of any exceptions to these restrictions hampers metaprogramming. Metaprogramming code cannot inspect or alter the behavior of huge chunks of code because it is prevented from seeing what's happening in other modules.

Although it's great to see the Java language finally start improving again, the gradual loss of metaprogramming facilities might actually cause a long-term weakness in the language. As I mentioned earlier, I think the benefits of metaprogramming are social, not technical. It's a pressure valve that allows the broader programming community to add new behaviors to Java to suit their needs when the main language stewards are unable or unwilling to do so. With the language evolving relatively quickly at the moment, it's hard to see the benefits of metaprogramming. The loss of metaprogramming features will be felt in the future when outside developers can't extend the language with experimental new features and, as a result, the language fails to embrace new trends. The loss will be felt if there's ever another period of stagnation or conflict about the future direction of the language, and outside developers can't use metaprogramming to independently evolve the language. Hopefully, this gradual loss of metaprogramming support in Java is just a temporary problem and will not prove detrimental to the long-term health of the language.

Wednesday, December 09, 2015

Transcoding Some Videos

One of my websites has some videos on it, and I usually just embed some YouTube videos there. You don't have to pay for hosting it, YouTube takes care of encoding the videos so that it can be used on multiple devices, and you can potentially get some views from people searching for stuff on YouTube. But recently, I've started to get concerned about embedding third-party widgets like that. It's a little unclear how compliant my website can be with its privacy and cookie policy if these third party widgets can change their own cookie and privacy policies at will.

So I looked into what's involved in hosting the videos myself, and it turns out the hit wouldn't be too bad. Since the videos were slideshows, they actually compress really well. I played with different ffmpeg settings, and I found that I could drop from the 50MB files that my video program produced to 5MB files by using two passes, variable bit rate, and a large maximum duration between key frames.
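
For reference, the invocation was along these lines (reconstructed from memory, so treat the bitrate and file names as placeholders; -b:v sets the variable bit rate target, -g raises the maximum key frame interval so near-identical slideshow frames compress away, and the two -pass runs do the two-pass encode):

    ffmpeg -i slideshow.mp4 -c:v libvpx -b:v 500k -g 9999 -pass 1 -an -f webm /dev/null
    ffmpeg -i slideshow.mp4 -c:v libvpx -b:v 500k -g 9999 -pass 2 slideshow.webm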

Now for the second problem. There are two main video formats on the web: webm and mp4. Apple owns patents on mp4, and they purposely refuse to support any video formats except mp4 on their devices so that anyone who wants to provide video content to Apple users must pay for Apple's patents. I couldn't just use ffmpeg to transcode my videos to mp4 format because it doesn't come with a proper patent license (licenses are needed to decode and encode the h.264 video and AAC audio). I tried scouring the Internet for a properly licensed version of ffmpeg that I could buy, but I had no luck. I could have just purchased a whole new video program with its codec packs, but it's hard to tell whether the codecs that come with a video program would expose the tuning parameters I needed to get the small sizes I wanted.

In the end, I went with a cloud transcoder since they presumably purchase a patent license for their services. It turns out most of the cloud transcoding services have gone bankrupt, so there are only a few big ones left, like Amazon Elastic Transcoder, Zencoder, and Telestream Cloud. Initially, I was leaning towards Zencoder because they were pretty upfront about the fact that they let you set all the ffmpeg parameters yourself, and they said they do 2-pass encoding. But the system seemed sort of messy--you needed to copy your files into S3 and give them read rights. At that point, it seemed easier just to go with Amazon since I already had an account with them. At first, I couldn't get the Amazon stuff to start, but apparently, the Amazon transcoding console won't start until you upload a video file to S3 first, which is a little bizarre, but whatever. In the end, Amazon actually exposed the parameters I needed to get my slideshow to compress well, and the final file sizes seemed to be competitive with what I was getting from 2-pass encoding with ffmpeg myself, so I suspect that Amazon must be doing 2-pass encoding too but not saying so in their documentation. The web interface requires you to manually enter the settings for every single file you want to transcode, which is a pain, but I only had 15 videos or so, so it wasn't too bad. It's possible to script the transcoding using their APIs, but I was too lazy to do that.

Actually serving mp4 videos on a website also requires a patent license (separate from the patent license for encoders and decoders). Fortunately, I was offering free educational Internet videos, and those are exempt from royalties.

It's sort of annoying that the technical aspects of putting a video up on my website only took me an hour or two to figure out, but the process of trying to figure out how to do so legally ended up taking two days. I really hate how Apple is doing everything possible to sabotage the web and extract maximum profit from it--they patent important parts of the HTML specification, they refuse to support formats that can be used without patents, they refuse to support new standards in their browsers if it makes them competitive with apps--it's just ridiculous sometimes.