Programmers like to talk about scaling and performance. They talk about how they made things faster, how some app somewhere is hosted on some large number of machines, how they can parallelize some task, and so on. They particularly like to talk about techniques used by monster sites like Yahoo, Twitter, Flickr, etc. Things like federation, sharding, and so on come up regularly, along with talk of MogileFS, memcached, and job queues.
This is a lot like gun collectors talking about the relative penetration and stopping power of their guns. It’s fun for them, and there’s some dick-wagging involved, but it doesn’t come into practice all that much.
Most programmers are working on projects where scaling and speed just aren’t all that important. It’s probably a webapp with a database backend, and they’re never going to hit the point where any “standard” component becomes an insoluble bottleneck. As long as the app responds “fast enough”, it’s fine. You’ll never need to handle thousands of requests per minute.
The thing that developers usually like to finger as the scaling problem is the database, but fixing this is simple.
If the database is too slow, you throw some more hardware at it. Do some profiling and pick a combination of more CPU cores, more memory, and faster disks. Until you have to have more than 8 CPUs, 16GB RAM, and a RAID5 (6? 10?) array of 15,000 RPM disks, your only database scaling decision will be “what new system should I move my DBMS to”. If you have enough money, you can just buy that thing up front.
Even before you get to the hardware limit, you can do intelligent things like profiling and caching the results of just a few queries and often get a massive win.
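To make the caching point concrete, here is a minimal sketch of an in-process cache with a time-to-live. Everything in it is an assumption for illustration: the `QueryCache` class, the `get_or_compute` method, and the `run_slow_report` function in the usage note are made-up names, and a real app would more likely lean on memcached or whatever its framework already provides.

```python
import time


class QueryCache:
    """Tiny in-process cache that keeps each result for a fixed time-to-live."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the expiry logic can be tested
        self._store = {}        # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for key, calling compute() only on a miss."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]     # still fresh: skip the expensive query
        value = compute()       # miss or expired: hit the database once
        self._store[key] = (now + self.ttl, value)
        return value
```

Usage would look something like `cache.get_or_compute("front_page_report", run_slow_report)`, where `run_slow_report` stands in for whichever query the profiler flagged. The tradeoff is the usual one: results can be up to `ttl_seconds` stale.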
If your app is using too much CPU on one machine, you just throw some more app servers at it and use some sort of simple load balancing system. Only the most short-sighted or clueless developers build apps that can’t scale beyond a single app server (I’m looking at you, you know who).
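For a sense of how simple "simple load balancing" can be, here is a rough sketch of round-robin selection across app servers. The `RoundRobinBalancer` class and the hostnames are hypothetical, and in practice you would use an off-the-shelf load balancer rather than writing this yourself; the point is only that the core idea fits in a dozen lines.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backends in a fixed rotation, one per request."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("need at least one backend")
        self._cycle = itertools.cycle(backends)

    def next_backend(self):
        """Return the backend that should receive the next request."""
        return next(self._cycle)


# Hypothetical app servers; adding capacity means adding a name to this list.
balancer = RoundRobinBalancer(["app1.internal", "app2.internal", "app3.internal"])
```

This only works if the app keeps no per-user state in a single server's memory, which is exactly the design mistake that stops an app from scaling past one box.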
All three of these strategies are well-known and quite simple, and thus are no fun, because they earn no bragging rights. However, most apps will never need more than this. A simple combination of hardware upgrades, simple horizontal app server scaling, and profiling and caching is enough.
I’ll be the first to admit that DateTime is the slowest date module on CPAN. It’s also the most useful and correct. Unless you’re making thousands of objects with it in a single request, please stop telling me it’s slow. If you are making thousands of objects, patches are welcome!
But really, outside your delusions of application grandeur, does it really matter? Are you really going to be getting millions of requests per day? Or is it more like a few thousand?
There are a whole lot of sites and webapps that only need to support a couple hundred or a couple thousand users. You’re probably working on one of them ;)