Friday, May 12, 2017

Building a Basic City Builder

This post is me rambling about trying to understand how city builder games work by making a very, very simple city simulation model.

I love city builder games, but I always get very frustrated playing them. I think the problem is that most designers and programmers of city builder games don't actually obsess over cities and don't spend hours reading about and thinking about them. Now, Will Wright, the designer of the first SimCity, did spend a lot of time reading about the philosophy of cities, and the original SimCity was a great simulation for a game that had to run on a 4.77 MHz computer. It was built around themes of how residents needed a balance of residential, commercial, and industrial zones and of how land values and transportation access were important. Later city builder games do include much more complex city models, but I find they lack any over-arching theme or philosophy about the nature of cities. How does New Urbanism, one of the biggest movements in urban planning of the last few decades, have none of its tenets reflected in any of the most recent city builders? The designers of the latest city builder games simply focus too much on the game aspects of city builders and not enough on the urban planning. They design game mechanics and simulation parameters based on what seems "fun" instead of reflecting on a philosophy or theme of what constitutes a city.

Part of the joy of cities is that they are full of stories. Every neighbourhood has a story about how it evolved and grew and all the little things that people do there. Where do people shop? How do they get to work? What do they do for fun? Every city has a different story. But most city builder games have their simulation parameters set in such a way that they can only tell one story. For example, SimCity 4, which I consider to still be the pinnacle of the SimCity series, has its simulation set up in such a way that you almost inevitably end up with a city that looks like northern California. The simulated residents are highly biased in favour of driving, and you have little ability to influence that. The city simulation is resistant to the levels of densification typical of non-American cities. The simulation doesn't allow farms in built-up areas, but I encountered plenty of urban farms when I lived in Switzerland. Even a basic assumption of the game like the fact that you need to supply water and sewage infrastructure to have even a basic level of housing development isn't actually true. Dubai was able to build many towering skyscrapers that weren't hooked up to a sewage system. All of these assumptions and fixed parameters in the city simulation constrain what sorts of cities can be produced in the game and restrict the types of stories that players can tell. Even worse, SimCity 5 completely abandoned all pretense of accurately simulating a city and embraced purely game-based mechanics for modelling cities.

There's a lot of buzz about Cities: Skylines, which was made by a Finnish developer previously known for some awful transportation simulation games. Their transportation simulator never worked well for trains and buses, and it still doesn't, but they did get it to work well enough for cars that they were able to make a financially successful city simulator. Similar to how the developers built many transportation games that focused on modelling minute details of bus scheduling and bus stops while completely missing the big-picture understanding of how mass transit actually works, Cities: Skylines has a detailed city model underneath that provides a simulacrum of a city when viewed at scale, but it has no meaningful philosophy in its design and doesn't make much sense when you poke into it. One of the major aspects of the simulation models the player's ability to move dead bodies through the city! I'm currently living in Toronto, and I can't help but think that Jane Jacobs would cry if she knew how many YouTube videos there are of people building multi-block, multi-story so-called "optimal" intersections in Cities: Skylines. The sheer prevalence of these videos is a sign that the underlying simulation model and theme of the game is broken. Note to armchair city builders: if you're building a continuous flow intersection in your city, you've already failed.

Of course, it's easy to complain about things. It turns out that I don't really know how to make a better system. Although I think I figured out how to make reasonable transportation models many years ago, I've never figured out how the underlying economic models of city simulators should work. In fact, I'm not entirely sure how the economic models of existing city simulators are designed. As such, it's hard to know what their underlying assumptions are, how they might be wrong, and how they might be fixed. The economic models of games are obviously biased in favour of growth. If a player lays out tracts and tracts of residential zones in the middle of nowhere, people will suddenly build houses in those zones for no apparent reason. Admittedly, in many places in the world, this is a reasonable assumption. In the areas near large, booming, metropolitan centres, if the government were to spend millions to build out sewage, power, and highway infrastructure to an area and then zone it for a subdivision, developers would quickly build tracts and tracts of suburban housing there. And for gameplay purposes, it's important for the city simulation to be biased towards growth because players love the feel of an expanding city where bigger and better things are constantly being built (though playing a dying city where the infrastructure must be slowly rolled back as people move out and where its role has to be reinvented might make an interesting scenario). But is this biasing towards growth done in a heavy-handed way that restricts the ways that a city can evolve or in a subtle way that still lets players design a city the way that they want?

To get a better insight into the way these economic models might work, I dabbled a bit in reading academic papers on urban planning models, but I never could figure them out. I fell back on a trick I figured out in high school and looked for the oldest paper I could find on the subject, and I actually found one that was somewhat comprehensible: Kenneth Train's "A Validation Test of a Disaggregate Mode Choice Model." My takeaway from the paper is that real-world urban planning models are based on polling a population and building statistical models of how that population weighs the choices it makes about where to live and how to get around. For someone building a computer game, a micro-economic agent simulation should capture this. Basically, you have a statistical distribution saying that for every 100 people, 30 prefer a house with a yard, 20 choose their home based on the quality of the schools, 35 need to live within 10 minutes of work, and 15 like having a lot of cultural amenities. Then during the game, you randomly generate people based on the statistical distribution, throw them into the city, and have them make individual choices based on their preferences. Then, you just have to choose an appropriate statistical model of people to get the biases you want for your game. In hindsight, this is pretty obvious. If you model a bunch of individual, different people, then in aggregate, you will get an accurate city model. This still left a big problem though. An agent simulation like this will accurately model the residents already in a city, with all of its assumptions explicitly encoded, but it doesn't really work for modelling a city's growth. Why do people move to a city? How do you bootstrap an initial population for a city that has no buildings, no residents, and no infrastructure? If a game just regularly generates random people based on a statistical distribution and throws them into the city, then the whole simulation is inherently biased towards growth again. It seems like too blunt an approach to the problem. Surely, there must be a more nuanced way of modelling growth with a better philosophy behind it than unlimited growth? Is there a way of modelling growth that provides more adjustment knobs that can be used to encode different assumptions about growth?
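
As a concrete illustration, here's roughly what such an agent model might look like. This is a minimal sketch of my own (not code from any actual game), and the preference categories and weights are just the made-up numbers from the paragraph above:

```java
import java.util.Random;

// A sketch of sampling residents from a preference distribution and letting
// each one score potential homes.
enum Preference { YARD, SCHOOLS, SHORT_COMMUTE, CULTURE }

class Home {
    double yardSize, schoolQuality, commuteMinutes, culturalAmenities;
}

class Resident {
    final Preference preference;

    Resident(Preference preference) { this.preference = preference; }

    // Score a home by how well it matches this resident's dominant preference.
    double score(Home home) {
        switch (preference) {
            case YARD:          return home.yardSize;
            case SCHOOLS:       return home.schoolQuality;
            case SHORT_COMMUTE: return -home.commuteMinutes; // shorter is better
            default:            return home.culturalAmenities;
        }
    }
}

class PopulationSampler {
    // Cumulative distribution: 30% yard, 20% schools, 35% short commute,
    // 15% culture -- the made-up proportions from the text above.
    private static final double[] CUMULATIVE = { 0.30, 0.50, 0.85, 1.00 };
    private final Random rng = new Random();

    Resident next() {
        double roll = rng.nextDouble();
        for (int i = 0; i < CUMULATIVE.length; i++) {
            if (roll < CUMULATIVE[i]) return new Resident(Preference.values()[i]);
        }
        return new Resident(Preference.CULTURE); // unreachable; keeps javac happy
    }
}
```

During play, the simulation would call next() whenever it decides to spawn a potential migrant and have the new Resident pick the highest-scoring Home it can find.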

I thought about this growth problem for a few years, but I could never make any headway with it. Regularly generating new random people to move to a city might work for competitive, multiplayer games of SimCity: if there are different cities with different amenities, then depending on how your city compares to others, newly generated people might choose to move to other cities instead of yours. But I couldn't figure out how a growth model would work for a single-player city building game. I decided that the only way to find a reasonable approach to this problem would be to actually build a small game where I could dig into the details of the problem. Hopefully, after being enmeshed in the details, I would be able to see something that I couldn't see from far away. I came up with a design for a small city simulator that would focus on the economic model (since I felt I already understood how to design the transportation model), and then it was just a matter of finding the time to build it.

Finally, last week on Thursday, I received a last-minute e-mail saying a spot had opened up at the TOJam game jam running that weekend, so I decided that it was time to dive in. I had worked out a design for a simplified city builder earlier. The city builder would present the side view of a single street. Since the focus was on the economic model and streetscaping and not on transportation issues, there was no need for a full 2d city. Having a side view also meant that the game could have a simplified interface that might even work ok on cellphones. In the game, players would place individual buildings and not zones. I think most city builder players like to customize the looks of their cities, but placing individual buildings doesn't work well at a large scale. On the small scale of a side-view game, though, I was hoping that placing individual buildings would be feasible. During the first day, I was able to finish coding up a basic UI that would let players plop buildings on the ground and query them. There was a floating artist at TOJam, Rob Lopatto, who drew some amazing pixel art for me of a house and two types of stores.



On the second day, I coded up a basic traffic model. Since I was just trying to make something as simple as possible in a limited time, I only modelled people walking between buildings at a fixed speed. Similar to SimCity 1-4, I modelled the aggregate effect of people walking around instead of actually modelling the specific, individual movements of those people on the road. I think the lesson of SimCity 5 and Cities: Skylines is that modelling the movement of individual cars can be slow and leads to strange anomalies, especially when there is extreme traffic. In real life, during extreme traffic, people shift their schedules to travel during non-peak times, or they change routes, or they move. It is rare for a traffic situation to become so dire that people end up in multi-day traffic jams and never reach their destinations. The problem with modelling the aggregate effect of traffic is that the simulation simply outputs some traffic numbers for chunks of road. There's nothing to see, and players like seeing little people and cars moving around. So I had to code up a separate traffic visualization layer that would show people moving around in proportion to the amount of traffic there is. I wasn't sure that I would end up showing the right amount of traffic if I generated people doing whole trips (my queuing theory is really bad), so instead I used the SimCity 4 trick of randomly generating people to walk around for short sections of road and then disappear again. I could then just periodically generate new people on sections of road that weren't showing enough traffic over time in their visualization. Surprisingly, even though my simulation was small enough that I could simulate the whole world 60 times a second, I still ended up using my geometric series approach in both the traffic visualization and parts of the simulation. It worked really well!
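
Here's a rough reconstruction of that visualization trick. The decay factor, spawn probability, and class names are invented; this is a sketch of the idea, not the actual jam code:

```java
import java.util.List;
import java.util.Random;

// The simulation outputs an aggregate traffic level per road section, and the
// visualizer spawns short-lived walkers on sections that haven't been showing
// enough foot traffic lately.
class RoadSection {
    double simulatedTraffic;   // aggregate traffic computed by the simulation
    double visualizedTraffic;  // decaying tally of the walkers actually shown
}

class TrafficVisualizer {
    private final List<RoadSection> sections;
    private final Random rng = new Random();

    TrafficVisualizer(List<RoadSection> sections) { this.sections = sections; }

    void tick() {
        for (RoadSection s : sections) {
            // Let old walkers fade out of the tally by a constant factor each
            // frame -- the geometric series mentioned above.
            s.visualizedTraffic *= 0.99;
            // If this section is under-shown, occasionally spawn a walker
            // that crosses just this section and then disappears.
            if (s.visualizedTraffic < s.simulatedTraffic && rng.nextDouble() < 0.1) {
                spawnWalker(s);
                s.visualizedTraffic += 1.0;
            }
        }
    }

    private void spawnWalker(RoadSection s) {
        // Create a sprite that walks the length of s and then vanishes.
    }
}
```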

By the end of the second day though, I had hit a wall. I still couldn't figure out how to model city growth. I could simulate people in the city, but I couldn't figure out how to get new people to move in. I didn't want to explicitly encode a rule for having people automatically move into the city. Perhaps I could come up with some sort of hacky rule for when new residents would be induced into moving into the city. The new rule would likely still have an emergent behaviour of causing an implicit bias towards growth in the city, but if the rule still made thematic sense, then it would be more satisfying and could be tweaked and improved later on. I started leaning towards the idea of using jobs to induce people to move to the city. If there were companies looking to hire in the city, then people would move there. That mostly makes sense, and it avoids explicitly biasing the city simulation towards growth.

I still had a bootstrapping problem though. The companies in a city won't hire people unless they have customers and are making money. But if there's no one living in a city, then companies will have no customers and hence have no jobs. I could make companies hire people even when they have no customers, or I could maybe implement a hack where companies might tentatively hire people to see if they can make money and then fire them if it doesn't work out. I think games like SimCity and Cities: Skylines have a hack where cities with small populations have an explicit macroeconomic boost to industrial jobs. If you zone some industrial areas, some companies will move in and create some factories to employ people even if they have no customers and no one lives in the city. This seemed like just another artificial bias towards growth, even if it was in a different form, so I wanted something different.

Instead, I went with a different cheat: I created a type of building that was self-sufficient and could supply an initial boost of employment without depending on the existence of other people or infrastructure. I opted for subsistence farming plots. They could provide a minimal income even in a city with no other people or infrastructure, thereby attracting a population base. 100-150 years ago, the Americas were settled by offering free plots of farming land to immigrants, so it's not entirely unheard of, though I'm not sure how realistic that assumption would be now. Once the simulated city developed a sufficient population, there would be enough collective demand to make stores or small workshops profitable, so they would employ people, resulting in a positive feedback loop of growth. This ends up supporting a theme that a city is dependent on investments in infrastructure to support certain types of economic activity and growth. Or maybe it says that people power a city, but infrastructure is required to improve efficiency and productivity to unlock that power. In any case, I think those are reasonable philosophies around which a city simulation can be designed. I'd be a little bit afraid of making the rules too deterministic so that it feels more like a game than a story-generating city simulator (e.g. you need electricity to have a factory over size 3, or you need an outside road link to let your industrial population grow past 3000, or stuff like that). And there's another danger of inadvertently building an arbitrary civilization simulator instead (e.g. you need iron mines, coal mines, and an iron smelter to build the steam engine, which is then a prerequisite to industrial age buildings, etc.). But it does show that this philosophical approach is broad enough to capture many different city models.

On the third and final day, I polished up the UI and tweaked the city model a bit. Since there were only three different building types, and given the shortness of time, the final city model was still very simple, but it seemed to work well enough, and it helped me work out a different way to simulate growth in a city builder game. Here's an overview of the final simulation rules (a rough code sketch follows the list):
  1. People without a job will go to work at the building with the greatest demand for workers (demand must be at least one full worker). Farms always need one worker while stores need workers proportional to the number of visitors/customers they have
  2. People without a home will move to any home they can find
  3. People will move to a home that's closer to their work than their current home
  4. People will visit the closest store to their home to buy things, but if the number of visitors exceeds the store's capacity, the people will move to the next closest store (and so on)
  5. People without homes or jobs will leave the city
  6. People will cause 1 unit of traffic for each square of the street they need to walk on to get from their home to their work
  7. Any building that still has a demand for workers that can't be filled from the local population will hire someone from outside the city (provided that person can find housing)
  8. Every 100 turns, rent will be collected from each building's residents and workers. Upkeep costs for each building will also be deducted.
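Here's a hypothetical sketch of how a few of those rules (1, 5, and 8) might look in code. The names, constants, and the reading of rule 5 as requiring both no home and no job are my own assumptions, not the actual jam code:

```java
import java.util.ArrayList;
import java.util.List;

class Building {
    double workerDemand;   // rule 1: fixed at 1 for farms, visitor-driven for stores
    int workers;
    int residents;
    double rentPerOccupant = 1; // assumed value
    double upkeep = 2;          // assumed value
}

class Person {
    Building home, job;
}

class CitySim {
    List<Building> buildings = new ArrayList<>();
    List<Person> people = new ArrayList<>();
    double treasury;
    int turn;

    void step() {
        turn++;
        // Rule 1: each unemployed person takes the job with the greatest
        // unmet demand, provided the demand amounts to at least one full worker.
        for (Person p : people) {
            if (p.job != null) continue;
            Building best = null;
            for (Building b : buildings) {
                double unmet = b.workerDemand - b.workers;
                if (unmet >= 1 && (best == null || unmet > best.workerDemand - best.workers)) {
                    best = b;
                }
            }
            if (best != null) { p.job = best; best.workers++; }
        }
        // Rule 5: people with neither a home nor a job leave the city.
        people.removeIf(p -> p.home == null && p.job == null);
        // Rule 8: every 100 turns, collect rent and deduct upkeep.
        if (turn % 100 == 0) {
            for (Building b : buildings) {
                treasury += (b.residents + b.workers) * b.rentPerOccupant - b.upkeep;
            }
        }
    }
}
```
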
Here's the final game.

Tuesday, January 24, 2017

Trying Out Some Emscripten on Chrome

Omber is the GWT JavaScript app that I'm currently working on. It runs in a browser, and I've also created an Android version using Cordova. It has some computationally intensive routines, so it's sometimes a little sluggish on cellphones, which is understandable given how under-powered cellphones are. I've been looking at whether there are ways to improve its performance.

The cellphone version of Omber runs on Chrome (specifically, the Crosswalk version of Chrome). It's unclear how to get optimal performance out of Chrome's V8 JavaScript engine. The Chrome developers talk a lot about how great its Turbofan optimizer is, but they never actually give any advice on how to write your code to get the best code generation from Turbofan. My code does a lot of floating point math, and I really need the numbers to be packed tightly to get the best performance out of the system. Should I be manually packing my numbers into Float64Arrays? Or is V8's Turbofan smart enough to store unboxed doubles directly in objects? Are there ways I can add type hints to arrays and other methods? Can I reduce the number of array bounds checks? In a language like C++, I could simply write my code in a way that would produce the code generation that I wanted, but how do I guide Chrome into generating the code that I want?
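
To make the data-layout question concrete, here are the two layouts I'm weighing, sketched in the GWT Java that Omber is written in (the class names are illustrative):

```java
// Layout A: one object per point. Whether the two doubles stay unboxed
// inside the object is up to the JIT.
class Point {
    double x, y;
}

// Layout B: structure-of-arrays. The coordinates are packed into flat
// double[] buffers, which (as I understand it) GWT compiles down to plain
// JavaScript arrays of numbers, so the layout no longer depends on the
// optimizer's mood.
class PointBuffer {
    final double[] xs, ys;
    int size;

    PointBuffer(int capacity) {
        xs = new double[capacity];
        ys = new double[capacity];
    }

    void add(double x, double y) {
        xs[size] = x;
        ys[size] = y;
        size++;
    }
}
```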

Mozilla has its Emscripten project that can compile C++ to JavaScript asm.js-style code. Firefox then has a special optimizer for translating JavaScript written in the asm.js style into highly optimized machine code. Personally, I think asm.js isn't a great idea. The asm.js subset is very limiting and sort of hackish. As far as I can tell, the code it produces is not very portable either. Basic things like memory alignment and endianness are ignored or simply handled poorly. For these reasons, most of the other browsers don't support asm.js-specific code optimization, but they claim that their optimizers are so good that their general optimization routines will still get good performance out of asm.js code.

So is it worth using Emscripten or not then? To try things out, I made a small test where I took my polygon simplification code and rewrote it in C++, compiled it using Emscripten to JavaScript, and compared the performance to my original GWT code. I was too lazy to record the actual numbers I was getting during my benchmarking runs, but here are the approximate numbers:

Original code on Chrome: ~280ms
Emscripten code on Chrome: ~230ms
Emscripten code with -O2 on Chrome: ~300ms
Original code on Firefox: ~4000ms
Emscripten code on Firefox: ~160ms
C++ code: ~150ms

Takeaways:


  • The Firefox code optimizer isn't very good, so having a special optimizer for asm.js is really useful for Firefox. With asm.js code, though, Firefox was able to get performance pretty close to that of raw C++.
  • The Chrome optimizer is so good that the performance of the normal JavaScript code is almost as good as that of the Emscripten code. In fact, it probably wasn't worthwhile rewriting everything in C++ because I could probably have gotten similar performance by optimizing my Java(Script) code more.
  • Since the Chrome optimizer isn't specifically tuned for Emscripten code, the Emscripten code might actually result in worse performance than plain JavaScript depending on whether Turbofan is triggered properly or not. For example, compiling the Emscripten code with more optimizations (i.e. -O2) actually resulted in worse performance on Chrome.
I was a little worried that Chrome's V8 engine might be tuned differently on cellphones, meaning that I might not get similar performance numbers when running on a cellphone. So I also ran the benchmarks on Cordova:


Original code on Chrome: ~2600ms
Emscripten code on Chrome: ~1600ms
Emscripten code with -O2 on Chrome: ~2800ms

Here, we can see that the Turbofan optimizer is still triggered even on cellphones, and the resulting code performs much better than the original JavaScript code. The Turbofan optimizer still isn't reliable though, so you might actually get worse performance depending on the Emscripten code output.

I'll probably stick with the Emscripten version for now, but I'll later try to optimize my original JavaScript and see if I can get similar performance out of it. It would be nice if I could just link my C++ code directly with JavaScript, but Cordova doesn't allow this. In Cordova, all non-JavaScript code must be triggered asynchronously through messages, which isn't a good fit for my application. It might be possible to do something with Crosswalk, but it seems messy and I'm too lazy. 

Alternatively, I could try using Firefox on cellphones since its optimizer can get performance that's near that of C++, but the embedding story is a little unclear. The Mozilla people abandoned support for embedding their Gecko browser engine and ceded that market entirely to Chrome/Blink. They've now realized that it was a mistake, and they're trying to get back in the game with their Positron project etc., but I think they've entirely missed the point. They're building an embedding API that's compatible with Chrome's CEF, but Chrome's CEF already works fine, so why would anyone want to use Mozilla's version? The space to play in is the mobile market. Instead of wasting time on FirefoxOS, they should have spent more time working on embedded Firefox for mobile apps. An embedded Firefox for iOS with a JavaScript precompiler would be really useful, and Mozilla could dominate that space. Well, whatever.

Friday, September 23, 2016

It's Impossible to Write Correct JavaScript Programs

During a coding session involving JavaScript, UIs, and new asynchronous APIs, I realized that it's no longer possible to write correct programs with user interfaces. The JavaScript language has long had a problem with isolated language designers who tinker on their own part of the system without seeing how things fit together as a whole. Though their individual contributions may be fine, when everything gets put together, it's a mess that just doesn't work right. Then, later generations of designers patch things over with hacks and fixes to "smooth things over" that just make the language more and more convoluted and complicated.

Right now, the language designers of JavaScript are all proud of their asynchronous JavaScript initiatives like promises, async, and whatnot. These "features" shouldn't be necessary at all. They are "fixes" to bad decisions that were made years earlier. It's clear that the designers of these asynchronous APIs mainly do server-side work instead of front-end work because asynchronous APIs make it next to impossible to write correct user interfaces.

In all modern UI frameworks, the UI is single-threaded and synchronous. This is necessary because UI code is actually the trickiest and hardest code to write correctly. People who write back-end code or middleware code or computation code actually have it easy. Their code can rely on well-defined interfaces and expected protocols to ensure the correctness of their algorithms. As long as your code calls the libraries correctly and follows proper sequences, then everything should work fine. By contrast, UI code interfaces with the user. The user is messy and unpredictable. The user will randomly click on things when they aren't supposed to. You might think, how hard can it be to write some code for clicking on a button? But what happens when they start clicking on different buttons with their mouse and finger at the same time? What happens when they start dragging the scrollbar in one direction while pressing the arrow keys in the opposite direction at the same time? Users will drag things with the mouse while spinning the mousewheel and then press ctrl-v on the keyboard and get angry when the UI code becomes confused and formats their hard drive. Then when you fix that problem, some other user will get angry because they were using that combination as a quick shortcut for formatting hard drives and want the old behavior back. Reasoning about the correctness of UI code is very hard, and the only thing that makes it tractable at all is that it's all synchronous. There is one event queue. You take an event from the queue and process it to completion. When you take the next event off the event queue, you don't know what it is, but it will be dependent on the new state of the UI, not the old one. You don't know what crazy thing the user is going to do next, but at least you know the state of the UI whenever the next event occurs.
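
Here's the entire synchronous model in miniature -- a generic sketch of my own, not any particular framework's code:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// One queue, one thread, and each event is handled to completion before the
// next one is dequeued, so a handler always sees the UI in a settled state.
class EventLoop {
    interface Event {
        void dispatch();
    }

    private final Queue<Event> queue = new ArrayDeque<>();

    void post(Event e) {
        queue.add(e);
    }

    void run() {
        while (!queue.isEmpty()) {
            // No other handler can observe the UI half-updated, because
            // nothing else runs until dispatch() returns.
            queue.remove().dispatch();
        }
    }
}
```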

Asynchronous JavaScript inconveniently breaks the model. All of these asynchronous APIs and promises are based on the idea that you start an action in another thread and then execute some sort of callback when the execution is complete. This is fine for non-UI code because you can use modularity to limit the scope of how crazily the state of the program will change between when you invoke the API and when the callback is called. It's even fine if these sorts of APIs are needed occasionally in UIs. During the rare time that an asynchronous XMLHttpRequest is needed, I can spend the day mapping out all the mischief that the user might do during that pause and writing code to deal with it when the request returns. But these asynchronous APIs are now becoming so widespread that I'm just not smart enough to be able to work out these details any more. The user clicks a button, you call an asynchronous API, then the user navigates to a different view, then the asynchronous call comes back to show its result, but all the UI elements are different now. The textbox where you wanted to show the result is no longer there. So now in your promises code, you have to write all sorts of checks to validate that the old UI actually still exists before displaying anything. But maybe the user clicked on the button twice, so the old UI still exists, but the 2nd asynchronous call returned before the 1st one, so now you need to write some custom sequencing code to make sure things are dispatched in the proper order. It's just a huge unruly mess.
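
Here's a sketch of the kind of defensive bookkeeping that last scenario forces on you. The names are illustrative, and dropping everything but the newest request is just one possible policy:

```java
// A widget that tags each asynchronous request with a sequence number and
// ignores any response that isn't the newest one.
class SearchBox {
    interface Callback {
        void onResult(String result);
    }

    private long latestRequest = 0;

    void onButtonClicked(String query) {
        final long requestId = ++latestRequest;
        fetchAsync(query, result -> {
            // The user may have clicked again or navigated away while we
            // were waiting; only the newest request may touch the UI.
            if (requestId != latestRequest) return;
            if (!resultTextBoxStillExists()) return;
            showResult(result);
        });
    }

    void fetchAsync(String query, Callback cb) { /* some asynchronous API */ }
    boolean resultTextBoxStillExists() { return true; /* placeholder check */ }
    void showResult(String result) { /* write into the textbox */ }
}
```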

The only practical solution I can find is to suppress the user interface during asynchronous calls, so that the user can't go crazy on the user interface while you're doing your work. This is a little dangerous because if you make a mistake, you might accidentally forget to unsuppress the user interface during some strange corner cases, but dealing with these corner cases is a lot easier than dealing with the corner case of the user generating random UI events while you're waiting on an asynchronous call. There was one proposal to add an "inert" attribute to html to disable all events, but that was eventually killed. Right now, the only hope for UI coders is to misuse the <dialog> tag, but very few browsers support it currently.
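
In code, the suppression approach can be as simple as a counter gating the event dispatcher -- a minimal sketch, assuming a hypothetical UI layer where we control dispatch:

```java
// Count the asynchronous calls in flight and swallow user events while any
// are pending.
class UiGate {
    private int pendingCalls = 0;

    void beginAsyncCall() {
        pendingCalls++;
    }

    void endAsyncCall() {
        // This must run on every completion path (success, failure, timeout),
        // or the UI stays suppressed -- the corner case mentioned above.
        pendingCalls--;
    }

    boolean shouldDispatch() {
        return pendingCalls == 0; // drop user events during async work
    }
}
```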

The annoying thing is that these things are just sad hacks that make programming more and more convoluted. Despite all the pride that the JavaScript designers have in their clever asynchronous promises API, that too is just a hack to paper over previous questionable decisions. The root cause of all these issues is the arbitrary decision that was made many years ago that there would be no multithreading in JavaScript. As a result, the only way to run something in parallel is to use the shared-nothing Web Worker system to run things in, essentially, separate processes. Although the language designers proudly proclaimed that there would be no concurrency errors because the system didn't allow shared objects or concurrency mechanisms, this system ended up being so limited that no one really used it. There were no concurrency errors in JavaScript programs because no one used any concurrency. (Language designers are now trying to "fix" Web Workers by creating a convoluted API that adds back in shared memory and concurrency primitives, but only for JavaScript code that is translated from C++.) Once JavaScript multithreading was killed, a certain old dinosaur of a browser company (no, not Microsoft, I meant dinosaur literally) discovered that their single-threaded browser kept hanging. Although every other browser maker moved to multi-process architectures that ensured the browser remained responsive regardless of the behavior of individual web pages, this single-threaded browser would become unresponsive if any tab made a long-running synchronous call. Somehow, the solution to this problem was to remove all synchronous APIs from JavaScript. And now we can't write correct UI code in JavaScript any more.

JavaScript is getting to be a big mess again. The fact that it's no longer possible to write correct user interface code any more is a clear signal that something has gone wrong. The big browser vendors need to call in some legendary language gurus to rethink the language and redirect it down a more sane path. Perhaps they need to call in some academics to do some original research work on possible better concurrency models. This has actually happened in the past, when Guy Steele was brought in for the original JavaScript standardization or when Douglas Crockford killed ES4. It looks like something like that is needed again.

Sunday, January 10, 2016

Java Metaprogramming Is Widespread But Slowly Dying

Metaprogramming is one of those cool academic topics that people always talk about but that never seems all that practical or relevant to real-life programming. Sure, the idea of being able to reprogram your programming language sounds really cool, but how often do you need to do it? Is it really that useful to be able to change the behavior of your programming language? How often does a programmer need to do something like that? Shouldn't you be able to do everything in the programming language itself? It seems a lot like programming in Haskell--technically cool, but totally impractical.

I've recently started realizing that metaprogramming features in programming languages aren't important for technical reasons. Metaprogramming is important for social reasons. Metaprogramming is useful because it can extend the life of a programming language. Even if language designers stop maintaining a programming language and stop updating it with new features, metaprogramming can allow other programmers to evolve it instead. Basically, metaprogramming wrests some control of a programming language away from its main stewards and hands it to outside programmers.

One of the best examples of this is Java. Traditionally, Java isn't really considered to have good metaprogramming facilities. It has some pretty powerful components though.
  • It has a reflection API for querying objects at runtime. 
  • It has a nice java.lang.reflect.Proxy class for creating new objects at runtime (see the sketch after this list).
  • By abusing the classloading system, you can inspect the code of classes and create new classes. 
  • The JVM instruction set is well-documented and fairly static, making it feasible for programs to generate new methods with new behavior. 
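As a small, self-contained demonstration of that Proxy class, here's an object created entirely at runtime that logs each method call before delegating to a real implementation:

```java
import java.lang.reflect.Proxy;

public class ProxyDemo {
    public interface Greeter {
        String greet(String name);
    }

    public static void main(String[] args) {
        Greeter real = name -> "Hello, " + name;
        // Create a new object at runtime that logs each call, then delegates.
        Greeter logged = (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                (proxy, method, methodArgs) -> {
                    System.out.println("calling " + method.getName());
                    return method.invoke(real, methodArgs);
                });
        System.out.println(logged.greet("world")); // logs, then "Hello, world"
    }
}
```
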
The main missing pieces are
  • The instruction set is so big and complicated that it's cumbersome to analyze code or to generate new methods
  • You can't really override any of the JVM's behaviors or object behaviors
  • You can't really inspect or manipulate the running code of live objects
The crowning piece of the Java metaprogramming system, though, is annotations. To be honest, most of the real metaprogramming stuff is too complicated to figure out. Annotations, though, are simple. An annotation is just a small bit of user-specified metadata that can be attached to classes, methods, and fields. Its simplicity is what makes it so powerful. It's so simple to understand that many programmers have used annotations to trigger all sorts of new behaviors in Java. Annotations have been used and abused so much that their use is now widespread throughout the Java ecosystem. This type of metaprogramming is probably the most used metaprogramming facility in programming languages right now.
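
The whole annotation-driven pattern fits in a few lines. Here's a toy version of what those libraries do under the hood; the @OnStartup annotation is made up for illustration:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class AnnotationDemo {
    // A made-up marker annotation; real frameworks ship their own.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface OnStartup {}

    static class Tasks {
        @OnStartup void warmCache() { System.out.println("cache warmed"); }
        void unrelated() {}
    }

    public static void main(String[] args) throws Exception {
        // The "framework": scan for the annotation and trigger behavior.
        Tasks tasks = new Tasks();
        for (Method m : Tasks.class.getDeclaredMethods()) {
            if (m.isAnnotationPresent(OnStartup.class)) {
                m.invoke(tasks); // runs warmCache(), skips unrelated()
            }
        }
    }
}
```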

I believe that metaprogramming through annotations has allowed Java to evolve and to add new features despite long periods of inactivity from its stewards. For example, during the 10 years between Java 5 and Java 8, there weren't any major new features added to the Java language. While Java was stagnating during that period, other languages like C# and Scala were evolving by leaps and bounds. Despite this, Java was still considered competitive with others in terms of productivity. One of the reasons for this is that Java's metaprogramming facilities allowed library developers to add new features to Java without having to wait for Java's stewards. Java gained many powerful new software engineering capabilities during those 10 years that put it on the leading edge of many new software practices at the time. Metaprogramming was used to add database integration, query support, better testing, mocking, output templates, and dependency injection, among others, to Java. Metaprogramming saved Java. It allowed Java to be used in ways that its original language designers didn't anticipate. It allowed Java to evolve and stay relevant when its language stewards didn't have the resources to push it forward.

What I find worrisome, though, is that the latest language developments in Java are weakening its metaprogramming facilities. Java 8 weakened metaprogramming by not providing any reflection capabilities for lambdas. Lambdas are completely opaque to programs. They cannot be inspected or modified at runtime. From a functional/object-oriented cleanliness perspective, this is "correct." If an object/function exports the right interface, it shouldn't matter what's inside of it. But from a metaprogramming perspective, this causes problems because any metaprogramming code will be blind to entire sections of the runtime. Java 9 will further weaken metaprogramming by imposing extra visibility restrictions on modules. Unlike previous versions of Java, these visibility restrictions cannot be overridden at runtime by code with elevated security privileges. From a cleanliness perspective, this is "correct." For modules to work and be clean, normal code should never be able to override visibility restrictions. The problem is that the lack of exceptions hampers metaprogramming. Metaprogramming code cannot inspect or alter the behavior of huge chunks of code because it is prevented from seeing what's happening in other modules. 

Although it's great to see the Java language finally start improving again, the gradual loss of metaprogramming facilities might actually cause a long-term weakness in the language. As I mentioned earlier, I think the benefits of metaprogramming are social, not technical. It's a pressure valve that allows the broader programming community to add new behaviors to Java to suit their needs when the main language stewards are unable or unwilling to do so. With the language evolving relatively quickly at the moment, it's hard to see the benefits of metaprogramming. The loss of metaprogramming features will be felt in the future when outside developers can't extend the language with experimental new features and, as a result, the language fails to embrace new trends. The loss will be felt if there's ever another period of stagnation or conflict about the future direction of the language, and outside developers can't use metaprogramming to independently evolve the language. Hopefully, this gradual loss of metaprogramming support in Java is just a temporary problem and will not prove detrimental to the long-term health of the language.

Wednesday, December 09, 2015

Transcoding Some Videos

One of my websites has some videos on it, and I usually just embed some YouTube videos there. You don't have to pay for hosting, YouTube takes care of encoding the videos so that they can be used on multiple devices, and you can potentially get some views from people searching for stuff on YouTube. But recently, I've started to get concerned about embedding third-party widgets like that. It's a little unclear how compliant my website can be with its privacy and cookie policy if these third-party widgets can change their own cookie and privacy policies at will.

So I looked into what's involved in hosting the videos myself. It turns out the hit from hosting the videos myself wouldn't be too bad. Since the videos were video slideshows, they actually compress really well. I played with different ffmpeg settings, and I found that I could drop from the 50MB files that my video program produced to 5MB files by using two passes, variable bit rate, and a large maximum interval between keyframes.
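
For reference, here's a hedged reconstruction of that kind of encode, shelling out to ffmpeg from Java. The bitrate and keyframe interval below are illustrative guesses, not the values I actually used:

```java
import java.io.IOException;

// A two-pass, variable-bit-rate webm encode via ffmpeg.
public class TwoPassEncode {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Pass 1 only gathers statistics; the video output is discarded
        // (use "NUL" instead of "/dev/null" on Windows).
        run("ffmpeg", "-y", "-i", "slideshow.mov",
            "-c:v", "libvpx", "-b:v", "500k",
            "-g", "600",              // allow up to 600 frames between keyframes
            "-pass", "1", "-an", "-f", "webm", "/dev/null");
        // Pass 2 uses the stats log to spend bits where the slideshow changes.
        run("ffmpeg", "-y", "-i", "slideshow.mov",
            "-c:v", "libvpx", "-b:v", "500k",
            "-g", "600",
            "-pass", "2", "slideshow.webm");
    }

    static void run(String... cmd) throws IOException, InterruptedException {
        new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```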

Now the second problem. There are two main video formats on the web: webm and mp4. Apple owns patents on mp4, and they purposely refuse to support any video formats except mp4 on their devices so that anyone who wants to provide video content to Apple users must pay to license Apple's patents. I couldn't just use ffmpeg to transcode my videos to mp4 format because it doesn't come with a proper patent license (licenses are needed to encode and decode h.264 video and AAC audio). I tried scouring the Internet for a properly licensed version of ffmpeg that I could buy, but I had no luck. I could have just purchased a whole new video program with its codec packs, but it's hard to tell whether the codecs that come with a video program would expose the tuning parameters I needed to get the small sizes I wanted.

In the end, I went with a cloud transcoder since they presumably purchase a patent license for their services. It turns out most of the cloud transcoding services have gone bankrupt, so there are only a few big ones left like Amazon Elastic Transcoder, Zencoder, and Telestream Cloud. Initially, I was leaning towards Zencoder because they were pretty upfront about the fact that they let you set all the ffmpeg parameters yourself, and they said they support 2-pass encoding. But the system seemed sort of messy--you need to copy your files into S3 and give them read rights to it. At that point, it seemed easier just to go with Amazon since I already had an account with them. At first, I couldn't get the Amazon stuff to start, but apparently, the Amazon Transcoding console won't start until you upload a video file to S3 first, which is a little bizarre, but whatever. In the end, Amazon actually exposed the parameters I needed to get my slideshow to compress well, and the final file sizes seemed to be competitive with what I was getting from 2-pass encoding with ffmpeg myself, so I suspect that Amazon must be enabling 2-pass encoding but not saying so in their documentation. The web interface requires you to manually enter the settings for every single file you want to transcode, which is a pain, but I only had 15 videos or so, so it wasn't too bad. It's possible to script the transcoding using their APIs, but I was too lazy to do that.

Actually serving mp4 videos on a website also requires a patent license (separate from the patent license for encoders and decoders). Fortunately, I was offering free educational Internet videos, and those are exempt from royalties.

It's sort of annoying that the technical aspects of putting a video up on my website only took me an hour or two to figure out, but the process of trying to figure out how to do so legally ended up taking two days. I really hate how Apple is doing everything possible to sabotage the web and extract maximum profit from it--they patent important parts of the HTML specification, they refuse to support formats that can be used without patents, they refuse to support new standards in their browsers if it makes them competitive with apps--it's just ridiculous sometimes.

Tuesday, November 24, 2015

How John Tory Can Get Out of Building SmartTrack While Still Building SmartTrack

When Mayor John Tory was campaigning for his position, a key part of his platform was that he would solve Toronto's transportation issues by building a transit system called SmartTrack. Unfortunately, SmartTrack never made much sense as a transit plan: it's expensive for the limited transit benefits it provides; it's unlikely to deliver any of its promised benefits; and it's not actually within the mayor's powers to build it. In fact, the majority of Toronto voters voted for candidates who wanted to build an alternate transit plan, the Downtown Relief Line, instead. As a major campaign promise, though, he has to deliver something, yet it doesn't make sense for Toronto to waste money building SmartTrack. A wily politician would be able to get out of that promise without wasting all that money. But how can John Tory "build" SmartTrack without actually building it? Or, alternately, how can John Tory get out of spending all the money and political capital needed to implement his SmartTrack plan while still being able to face voters at the next election and claim that he's building it?

This blog post will look at what SmartTrack is, why it won't work, and how John Tory can get out of building it.

Why People Want SmartTrack

On its surface, the SmartTrack proposal sounds pretty promising. SmartTrack is supposedly able to provide a fast, frequent, high-capacity train service that covers most of the city and links together several major employment centres in the GTA such as downtown, business parks near the airport, and business parks in Markham. By taking advantage of existing railway tracks and unused lands throughout the city, the system can supposedly be built quickly and affordably. Who wouldn't want something like that? If you can build a useful transit service for not a lot of money, why wouldn't you do that?

The full system is 53km long and comprises 22 stations. It runs along "unused land" from the Airport Corporate Centre eastwards to the Kitchener GO train line. From there, it follows the same path to Union Station in downtown. From Union Station, SmartTrack would extend north-east along the same path as the Stouffville GO train line up to the in-development Markham downtown. The whole plan would supposedly cost only $8 billion and be built in under seven years.

Why SmartTrack Doesn't Work

When SmartTrack was proposed, many people were confused because transit planners had never proposed building such a system before. Since the proposal was new, no one had actually studied whether it would be possible to actually build it, so no one could intelligently argue against it. On the surface, it seems like it could be feasible. Don't we already have train tracks running through Toronto? Surely, we could just build a bunch of stations and run a service on them?

In reality though, the reason that no transit planner had ever proposed such a system before was that it wasn't that useful and it's much more difficult to build than suggested. No one had done a formal study of the issue, so no one could authoritatively criticize the project. But just by looking at maps, looking at ridership numbers of existing services, and listening to statements that transit planners have made in the past about the capacity of the existing train tracks, it was pretty clear that SmartTrack would not be an easy system to build and run. I suspect that the transit planners for the provincial government could have easily rebutted the claims made about SmartTrack, but they were told to keep quiet so as to not interfere with the election.

So why isn't SmartTrack feasible? Well, let's look at its promises:

  • Lots of Stations: One of the claimed benefits of SmartTrack is that there would be lots of stations around the city where people can get on the train. The problem with having lots of stations is that when a train is stopped at a station, no other train can pass by. On a subway or LRT, this isn't a problem, but SmartTrack runs along other people's train tracks, and those owners won't be happy if their trains have to stop every few hundred metres while the SmartTrack train pulls into station after station. SmartTrack probably can't get approval for building so many stations unless it also builds a lot of extra track so that SmartTrack trains don't interfere with existing trains using the corridor.
  • Frequent Service: SmartTrack supposedly will offer "frequent" service. When people think of frequent service, they usually think of a subway-like service that comes every 5 minutes. In reality, SmartTrack would at best be able to offer service every 15 minutes, and service would most likely only reach 30-minute intervals. If you have a choice between waiting 30 minutes for a train or just taking the local bus that comes every 5 minutes, most people would rather take the bus. The reason that SmartTrack is so infrequent is that the train tracks have limited capacity. Just because a train track exists doesn't mean you can run an infinite number of trains on it. Trains take a while to speed up and slow down, so you need to carefully manage the trains to prevent them from colliding with each other. Although it's possible to increase the capacity of the system through electrification, improved signalling, better train management, and building more track, these aren't straightforward changes to make. The other problem with frequent service is that running a frequent service is expensive, and there likely isn't enough demand to justify running that many trains. This is discussed in more detail later on.
  • Fast: SmartTrack is supposedly faster than other transit alternatives because it runs on its own train track and doesn't have to worry about traffic lights or car traffic. Although that is true, the SmartTrack route has a lot of stations. Because trains have steel wheels, they don't have much traction, so they are slow to speed up and slow down. The more stations there are, the more time the train has to spend slowing down at each stop, waiting for passengers, and then speeding up again. With so many stations, SmartTrack will likely be a lot slower than promised. It will almost certainly be slower than driving.
  • Cheap to Build: Because SmartTrack runs along an existing rail corridor and other unused land, it will supposedly be cheap to build. If you don't need to build new tunnels or bridges, then it should be pretty cheap to build, right? The SmartTrack plan says that it can be built for only about $8 billion (still a HUGE sum of money). The problem, though, is that the existing rail corridor might not have the capacity to handle all the SmartTrack trains, so making room for them could require building a lot of new tunnels and bridges anyway. The railway corridor on the eastern leg of SmartTrack only has a single track, so expanding it to support SmartTrack will require expropriating land, adding extra track, and building new tunnels and bridges where it crosses roads. The western leg of SmartTrack is already jammed with trains, so new track might need to be built there. The western track extension to the airport is supposed to run on unused land, but that land is now being used by condo projects. The downtown leg of SmartTrack is so near to capacity that the province was thinking of diverting trains to an alternate train station or building a giant tunnel in the future. Adding SmartTrack to downtown could exceed the capacity of those lines and force the building of those expensive projects.
  • Useful: Well, SmartTrack might be expensive and might not be as quick or as frequent as promised, but it would still be a nice service to have, right? True, but SmartTrack will be an expensive service to run, and it's not clear how many people will actually use it. GO Transit already runs trains along that route. Although it doesn't have that many stops and doesn't come too frequently, we can look at its ridership numbers to give us an idea of how much demand there might be for train service along that route. Those trains carry the highest demand part of the line: passengers commuting to downtown during peak periods. The reality is that there isn't that much demand. Although traffic in the northwest and northeast parts of the city isn't great, it still usually makes more sense to drive there than to take the train. Those parts of the city were specifically designed for driving and have a decent road system. Is it worthwhile spending hundreds of millions of dollars a year to run a service that won't be used by that many people? 
  • Quick to Build: One of the strangest parts of SmartTrack is that John Tory was promising to build it at all. It's strange because it's not within his power to build it. SmartTrack will run along a rail corridor that belongs to the province and some private companies. It's a variation of an existing train service that is owned and run by the province. The bulk of the financing is supposed to come from the federal and provincial governments. It's not clear what the city would contribute and how it would be within its power to build and run such a service. It would be as if John Tory made a campaign promise that Air Canada would run more frequent flights between Toronto and Windsor, using funding from the federal government. Although more frequent flights would be nice for the city, the mayor has no influence over Air Canada or the federal government. How could he make a promise on behalf of someone else?
The main problem with SmartTrack though is that it has been made completely redundant by the province's plans for a Regional Express Rail train running along mostly the same route. The Regional Express Rail train provides many of the same benefits of SmartTrack but is much cheaper (in fact, the province does not expect any financial contributions from the city beyond, perhaps, moving some utilities).

How to Get Out of Building SmartTrack

Given all the problems with the SmartTrack proposal, how can the mayor get out of building it? With the right messaging, it shouldn't be too hard. The province has seen that the mayor has painted himself into a corner and has provided him with ways to make a face-saving exit. But the mayor seems oblivious to this and is mishandling his communication in such a way that he can't change direction.

The province has seen the mayor's problems, and they have started building their own transit system that provides 70-80% of the benefits of SmartTrack at absolutely no cost to the city of Toronto. The Ontario government's Regional Express Rail project was originally supposed to serve only the west of the city. It provides fast service to a smaller number of stations along the same route as SmartTrack. In light of SmartTrack, the provincial government has decided to extend it to the east of the city along the same route as SmartTrack (even though existing ridership numbers didn't justify such an extension) and to add several new stations. They're also strongly considering the electrification of the tracks even though earlier studies suggested that it would be more beneficial to electrify a different set of tracks. Regional Express Rail makes SmartTrack a redundant transit service. Why spend money building SmartTrack if it duplicates an existing transit service? If John Tory were to do absolutely nothing, then he would get most of the benefits of his system without having to spend any money or political capital. The province will build it for him. But if he does nothing, he looks like he's abandoning his campaign promise. So how can John Tory back off from building SmartTrack without looking like he's walking away from his word?

It's all about messaging. Instead of focusing on SmartTrack as a specific plan involving 53km of track and 22 stations, John Tory can redefine SmartTrack so that it can include the Regional Express Rail. He can define it as a plan to leverage Toronto's existing rail corridors to help move Torontonians through the city. Instead of specifically requiring a heavy rail link to the airport business centres, he can just say something like, "we need to find a way to connect downtown with other important employment centres throughout the city, including Markham and the airport area." Instead of requiring there to be 22 stations, he can just say that the existing rail corridors don't serve Torontonians well because there aren't enough stations. Instead of SmartTrack being a specific transit plan, he can describe SmartTrack as being "smart" about taking advantage of Toronto's existing infrastructure to quickly build new infrastructure connecting as much of the city as possible. In particular, instead of getting the city's planners to study the building of the specific SmartTrack plan (a mistake he already made, unfortunately), he should have told the city's planners to come up with a plan that would leverage Toronto's existing rail corridors to better connect downtown with other employment centres throughout the city. He should define SmartTrack in terms of the outcomes and the benefits it will provide instead of its implementation. He should focus on the ends, not the means. That way, any transit plan that delivers the same benefits can be labelled as "SmartTrack." By defining SmartTrack more generally, he gives himself more leeway to alter the plan to accommodate the realities on the ground.

It also allows him to build something cheaper while still "building SmartTrack." For example, building a light rail line to the airport is much cheaper and more appropriate than an underground heavy rail line. By defining SmartTrack in terms of "connecting other employment centres to downtown," he could credibly build a light rail line while still claiming it to be part of SmartTrack. By defining SmartTrack as "leveraging the existing train tracks that cross the city to provide better transit service for Torontonians," he could get the TTC to pay the province to build more Regional Express Rail stations in Toronto and to let TTC riders ride it while paying a regular TTC fare. The outcome is the same, and he can apply his political pressure to ensure that the final Regional Express Rail system is good for Torontonians, but the actual financial and political cost is much less. And at the end of the seven years, he can still take credit for "building SmartTrack" even though the final system might not be exactly what he promised on the campaign.

The original SmartTrack plan promised by John Tory has been made redundant by other transit systems being built by the province. He needs a way to back out of those plans without looking like he's abandoning his promise. He can do that by redefining SmartTrack in terms of its outcomes instead of its implementation. By describing SmartTrack in terms of how it will help Torontonians move through the city instead of as a specific set of stations and train lines, he gains the flexibility needed to adapt the plan to the changing circumstances.

Friday, July 03, 2015

UIs and Layout Managers Using HTML and CSS

For the past few years, I've decided to stop learning new UI frameworks and to make all my user interfaces using HTML5. Making user interfaces is HARD. The idea that I should constantly throw away my old UI code and rewrite things from scratch every few years using new, half-baked UI frameworks is preposterous. It takes ages to learn the ins and outs of a UI framework and figure out how to get the behaviour "just right." Why would I want to discard code that works perfectly well and which I spent ages fine-tuning and replace it with new code based on a new, buggy UI framework? Since I do all my coding in Java, I went and ported GWT Elemental to JavaFX so that I could use HTML5 in my UIs from my Java code. I can now take my same UI code and reuse it on websites, on desktop applications, and for mobile UIs.

In the past, I've found using HTML for user interfaces to be problematic. HTML has traditionally used a word processor layout model. There's a central flow of text, and you can position pictures and other elements to the sides of the text. HTML really wants you to lay things out this way. If you try to do something different, you end up really fighting against the layout model and causing yourself grief. The standard components of a desktop UI -- widgets and toolbars that dock on the sides and status bars on the bottom -- really don't fit in well with HTML layouts.

HTML also often works at the wrong abstraction level for making good UIs.
  • the UI engine has to be on guard against exploits by untrusted code, so you can't easily capture the mouse, manage the clipboard, talk to other applications, etc.
  • you can't really do pixel fiddling. Sometimes, you just want to get in there and tweak the pixels to get the perfect look, but since HTML is a retained-mode UI, you can't easily do that. In the end, that has worked out ok because it made adding support for high-dpi screens fairly painless. But you can't do things like make rounded buttons with pixel-perfect shading without lots and lots of hoops to jump through. You can't take existing widgets and buttons and tweak the look a little bit by fiddling with the pixels. If you want to fiddle with the pixels on a button, you have to write all the logic for the button yourself (some of the accessibility stuff can get hairy!). You can't take an existing button and just fiddle with how it gets painted.
  • HTML has poor support for text input and internationalization. Even after many years of studying it, it's still unclear to me how to do rich-text internationalized input in a web browser. Maybe it's easy, maybe it's not. There's just not much talk about it.
  • HTML doesn't really have a concept of widgets. This is coming in the form of web components, shadow DOM, and templates, but these things are very much a work in progress, and it isn't clear when they'll be available for widespread use. In the meantime, HTML doesn't really support the idea of having self-contained UI components. If you make a custom UI widget, the "guts" of your widget are exposed in the HTML. Other components might accidentally move things around in your widget or restyle its CSS because there is no way to modularize your own code to prevent accidental tampering by other widgets.
  • the event model doesn't have easy hooks for doing common UI stuff like keyboard shortcuts, context menus, menu bars, enabling/disabling widgets, modal dialog boxes, file choosers, etc. Handling these things requires awkward flows of events, so regular UI frameworks like win32 or Swing have special hooks that allow you to tap into this event flow without having to build your own convoluted event handling framework
  • it's hard to lay things out at their "natural size." If you have a short form that you want people to fill in, it can get a little tricky to set its width and height to the minimum size needed to hold the form. Often you simply need to guess at an appropriate size.
  • since HTML is designed for making web pages, it does a poor job exposing platform-dependent behaviour to the application. What's the default font on the system? What's the default modifier key used for keyboard shortcuts? What keyboard events is it safe to intercept without destroying the accessibility of the platform? What's the default language?
  • you have to code the common UI widgets yourself because HTML doesn't come with any. Things like toolbars, menu bars, context menus, spin buttons, and scrollbars are all things you have to do yourself.
Despite all these major deficiencies in using HTML for traditional UIs (and I'm sure there are many more), HTML does have many advantages over other UI frameworks.
  • there are many more developers working on improving HTML5 than there are developers working on other UI frameworks, so it advances quickly
  • it's well-supported on new hardware and is easily cross-platform
  • it embraces certain features much earlier than other UIs (e.g. touch support and high dpi)
  • it has easy support for printing
  • it ages well, so old HTML code generally still works even on modern systems
So given that we want to use HTML for a traditional UI, how do we go about doing it? In the last few years, the layout options available using HTML and CSS have improved dramatically, with endless new features that cater to people designing UIs as opposed to word-processor print layouts. With all of these features though, it has taken me a while to figure out how to use them to make a traditional-looking UI. Here are some of the tricks that I've used.

The first thing is to make sure to zero out the margin, padding, and borders of all your html, body, div, and span elements. In the past, it was also necessary to set the height and width of the html and body elements to 100%, but I don't think that's necessary any more. I also don't think it's necessary to add "position: absolute;" or "position: relative" on the html and body elements any more. This is all necessary so that you can accurately stick things in the corners and sides of the page using absolute positioning. In a word processor layout, you want to have a margin on the sides, but in a proper UI, you want to have toolbars and menus there.
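
A minimal sketch of such a reset might look like this (extend the selector list if your pages use other elements too):

    /* zero out the word-processor-style spacing on the basic elements */
    html, body, div, span {
        margin: 0;
        padding: 0;
        border: 0;
    }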

In the past, it was important to avoid using pixels for positioning because people with poor eyesight would increase the font size to make things easier to read. Most designers couldn't handle this, so the modern approach is to use pixel positioning but to let users with poor eyesight adjust how big a pixel is. I'm a traditionalist though, and I still try to use layouts based on font size where possible, resorting to pixel sizes only when I actually need containers that hold images with a known pixel size. HTML5 now has new measurement units that make laying out resizable things easier.

The "rem" unit is the width of an "M" character on the body element. You can lay things out based on how many characters should fit in a certain area. Unlike the old "em" unit, which is the width of an "M" for the current element, you don't have to worry that you might be nested inside another element that changed the size of the font or something.

Similar to the "rem" unit, HTML5 also has the new measurement units "vw" and "vh", which express things in terms of percentage of the viewport width and height (i.e. width and height of the browser window). If you want your UI to resize when you resize the browser, then you need to express things in terms of percentages. Unfortunately, the old "%" unit was always a little confusing because it sized things in terms of percentage of the parent (and sometimes, it was not of the parent but of the first relatively or absolutely positioned parent). Often, you need to position div elements inside other div elements to get the right layouts, but you still wanted things to resize globally, so using "vw" and "vh" units lets you do that. There's even a "vmin" unit that's useful for making elements that have a certain ratio of height to width but that still resizes when you resize the browser. I suspect that "vw" and "vh" units might have similar problems to "%" when using things for widths. Sometimes, when you set two things with a width of "50%" beside each other, the actual size in pixels might be something like 500.5, and the browser might round those values up or down, meaning the final width might leave an extra pixel somewhere or it might overflow the width of the browser. I think modern browsers actual use floating point numbers for sizes and are a bit more generous about half pixels at the edge of the screen because I haven't had an issue with things like that in a while. There's also a "vmax" unit. That unit might be useful for scaling images when used in combination with max-width and max-height. I haven't had an occasion to use it yet though.

Using physical units like inches and picas is still ill-advised, I think. In the past, there was an issue where some browser makers would actually use real units there. So if you said you wanted something to be one inch, but you were using a 60-inch TV, the browser would actually make your element only a few pixels wide because that was what one inch was on a large TV (whereas on a tiny mobile phone, one inch might be half the screen). I think most browsers just set one inch to be 96 "pixels" now, but if that's the case, you might as well just use pixels directly for sizing things.

Once you have your units figured out, you need a way to stick things in different places in the window in order to make a traditional UI. To do that, you can use "position: fixed" or "position: absolute".

[figure: fixed positioning]

In the past, Apple sabotaged fixed positioning because the iPhone wouldn't follow it, but modern iPhones do behave properly now. I just find absolute positioning to be more flexible and easier to use though. If your UI has a central resizable area but some fixed-sized things on the side, then you could possibly use fixed positioning. The scrollbars for the whole web page will only control the central area, but that might be what you want. When using a keyboard to control a UI, this is useful because using the cursor keys to move around will always scroll the central area even if the keyboard focus is on one of the side panels. With absolute positioning, for example, your keyboard focus might be on a side panel when you first create your UI, so when the user tries using the cursor keys to scroll things, the central area won't scroll, contrary to their expectations. But this is fixable, so I'm not sure it's worth using "position: fixed" just for that. People might get confused by having the main scrollbar only control the central area too.
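
As a rough sketch of that kind of set-up (the ids are invented for illustration):

    /* the side panel stays put; the page's own scrollbar scrolls the central content */
    #sidebar { position: fixed; left: 0; top: 0; bottom: 0; width: 15rem; }
    #content { margin-left: 15rem; }  /* flows normally, so the page scrollbar controls it */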

Funnily enough, in the past, Apple also sabotaged absolute positioning on the iPhone. Sometimes, you wanted side panels in your UI that scroll, but the iPhone wouldn't show scrollbars on them, so people wouldn't realize that they scroll, and the iPhone gesture needed to actually scroll them was really confusing (some sort of two-finger thing). That's fixed now though.

[figure: floating panel]

Absolute positioning is obviously useful for free-floating toolboxes and windows, but you can also use it in UIs to provide the functionality of a BorderLayout layout manager. You can easily create one expandable center area, with fixed-sized components above, below, to the left, and to the right of it. Unlike fixed positioning, elements with absolute positioning can be nested, so any element or UI component can, in turn, use absolute positioning to lay out its internal elements using this border-style layout. HTML's absolute positioning is limited, though, because you have to specify sizes for the elements that you place on the sides. You can't let those components be laid out "naturally" and let the UI automatically figure out a natural width and height for them. You must explicitly give them a size.

To use absolute positioning properly in this way, you need to watch for some things:
  • By default, the width and height of an element given in CSS specifies the size of the content only and does not include the border and padding. This makes it hard to get boxes to line up properly beside each other because the size of an element is often given in different measurement units from the size of its border and padding. Typically, you would specify the width of an element in vw, its padding in rem, and its border in px. In the past, you would need to get around this problem by using nested div elements, but CSS now offers two better ways to deal with this problem. One is the calc() function in CSS that lets you calculate a measurement that mixes different measurement units. This is still a bit of a pain to use though, so the easier approach is to use the CSS "box-sizing: border-box" property to explicitly state that measurements should include the border and padding.
  • You have to make sure that you've positioned everything perfectly to fit inside the browser window, or you'll end up with scrollbars on the browser, which will throw the whole layout off. Sometimes, it's useful to sprinkle "overflow: auto;" and "overflow: hidden" on various elements to make sure content doesn't accidentally spill over and become larger than the browser window, triggering the appearance of scrollbars
  • If you've worked with other UI frameworks or even drawing frameworks like the HTML5 Canvas, you get into the habit of specifying the sizes and positions of things using left, top, width, and height. With absolute positioning, this can get you into trouble because you have to mix different measurement units, so things can get confusing really quickly. To get the most use out of absolute positioning, you have to remember that CSS lets you specify the sizes of things in terms of right too (i.e. distance from the right side). So a sidebar on the left can be positioned using "left: 0; width: 20rem;", a sidebar on the right can be positioned using "right: 0; width: 30vw;", and the central area that expands as the window is resized can be positioned using "left: 20rem; right: 30vw;" (see the sketch after this list). Notice how a width isn't even specified for the central area. Its size is specified by simply giving the positions of its left and right sides, and different measurement units are used for the two sides too.
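
Putting those pieces together, here's a minimal sketch of the whole border-style layout (the ids are invented, and "box-sizing: border-box" keeps padding and borders from throwing off the arithmetic):

    #left, #right, #center { position: absolute; top: 0; bottom: 0; box-sizing: border-box; }
    #left   { left: 0; width: 20rem; }
    #right  { right: 0; width: 30vw; }
    #center { left: 20rem; right: 30vw; overflow: auto; }  /* scrolls instead of spilling over */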

[figure: using absolute positioning for a border layout]

Recent web browsers also support new layout tools such as flexbox. Flexbox is nice because it lets you do things like vertical centering, aligning elements, mixing different measurement units, and making some limited use of the natural sizes of elements when doing layout. Unfortunately, there are a lot of knobs that you need to adjust to get a flexbox to work, and those knobs have confusing names, so I always forget what they are and have to spend a lot of time looking things up every time I want to use a flexbox.

I sometimes end up using flexbox layouts for really mundane things that should be easy in CSS, but that I always forget how to do, like making a line of boxes or a line of images. Laid out the old way, I'd forget to set the vertical-align property on those boxes, so they'd end up positioned inconsistently depending on what their contents are. Flexbox uses its own alignment rules, so you can avoid that whole mess.
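
For example, a sketch of the row-of-boxes case (the class name is invented here):

    /* children line up in a row, vertically centered, regardless of their contents */
    .thumbnail-strip { display: flex; align-items: center; }
    .thumbnail-strip > * { margin: 0.5rem; }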

In the future though, I'm eagerly awaiting the arrival of grid layouts in CSS. Although you can sort of do the same thing using tables, grid layouts should provide much more layout power than flexbox while reducing the amount of confusing HTML verbiage you need to write. With grid layouts, you can actually align things both horizontally and vertically! And you don't need to put elements in your HTML just to designate how things should be laid out. You just specify the different pieces of content you want in HTML, and then the CSS is used to position them in a grid. There's a bit of a concern that Apple has no desire to add support for grid layouts in Safari, but hopefully, they'll be swayed in time.
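
Going by the draft spec, the idea would look something like this sketch (the ids are invented, and the syntax could still change before it's finalized):

    /* the HTML just lists the pieces; the CSS decides where they go */
    #app     { display: grid;
               grid-template-columns: 20rem 1fr;  /* fixed sidebar, expanding center */
               grid-template-rows: 3rem 1fr; }
    #toolbar { grid-column: 1 / 3; }  /* spans both columns of the top row */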