Yii, Heroku, and The Asset Pipeline
Update 6/20/2013: I've since switched away from this strategy. Read more about the new and vastly improved strategy here.
I'm building a SaaS platform (Mantis) in the PHP framework Yii, and after some experimentation I've settled on Heroku as my host. (I also tried AppFog and setting it up myself on Amazon.) I've run into a few snags along the way that I think others may run into as well, and hopefully I can save someone out there several hours of toil.
Heroku runs on top of AWS EC2, and its dynos have no persistent filesystem to which you can write. There's a temporary "scratchpad" that you can use, but nothing that sticks around. In a typical Yii application, assets are published to a folder in the webroot called "assets" the first time the application is loaded or the first time a given asset is requested (http://www.yiiframework.com/wiki/148).
On Heroku, since the filesystem is ephemeral, those assets get blown away with every code push and every time the dyno switches, and they are not visible between different dynos (http://www.quora.com/What-are-the-potential-downsides-of-using-Heroku). Bummer. That means we can't use the typical Yii publishing mechanism, because the assets would have to be republished every time the dynos switch. There is, however, a silver lining: static assets really should be served from a cookieless domain (e.g., Amazon's S3) anyway, according to Google's best practices guide on speed.
The Solution... And Some More Problems
To address the filesystem issue and get the static assets onto a cookieless domain, I found a great Yii extension that pushes all your assets to S3 and can optionally put your CloudFront domain in front of it. Although that seems like a pretty straightforward solution, we run into a few more issues here.
The extension pushes all the assets to a bucket on S3 and then stores that information in a cache (of your choosing). Later, when an asset is requested, the S3 extension checks the cache to see if that asset has been published to S3. If it has, it returns the URL; if it hasn't, it publishes the asset and then returns the URL. Publishing can take a long time, especially if you have a lot of assets. We're talking 30 seconds or more on the first request as it publishes all your assets. That's no big deal on a staging server, but unacceptable in a production environment.
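The lookup described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the extension's actual PHP code; the class name, the `url_for` method, the fake upload function, and the example bucket URL are all hypothetical stand-ins, and a plain dict plays the role of the cache.

```python
from typing import Callable, List, MutableMapping


class CachedPublisher:
    """Publish-on-miss lookup: the cache remembers which assets are already uploaded."""

    def __init__(self, cache: MutableMapping[str, str], upload: Callable[[str], str]):
        self.cache = cache    # hypothetical stand-in for the configured cache component
        self.upload = upload  # hypothetical stand-in for the slow S3 upload; returns a URL

    def url_for(self, path: str) -> str:
        url = self.cache.get(path)
        if url is None:
            # Cache miss: pay for the upload once, then remember the URL.
            url = self.upload(path)
            self.cache[path] = url
        return url


uploads: List[str] = []


def fake_upload(path: str) -> str:
    # Records each call so we can see how often the "slow" step runs.
    uploads.append(path)
    return "https://bucket.example/" + path


publisher = CachedPublisher({}, fake_upload)
first = publisher.url_for("css/site.css")   # miss: triggers the upload
second = publisher.url_for("css/site.css")  # hit: served from the cache
```

Only the first request for an asset pays the upload cost, which is why a cold cache with many assets makes that first page load so slow.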
The ideal situation would be for the staging server to publish all the assets to S3, and the production server to know about it. That would mean that the staging and production servers would have to share a cache: really bad plan.
Heroku offers several add-ons for caching, including memcache and IronCache, amongst others. The Memcache add-on on Heroku depends on SASL authentication, which Yii doesn't offer out of the box. I wrote a MemcacheSASL extension for Yii and am using that as my main application cache, with complete separation between staging and production, ne'er the twain shall meet.
We still need to get a cache for the list of published assets though... The S3 extension allows you to declare the caching component used, which means there is no need to use the system-wide cache (memcache in my case). For the S3 cache, which will be shared between the production and staging servers, I decided to go with IronCache. The only thing that will be stored in IronCache is asset published/not-published data. All other caching is done in memcache. Again, there was no IronCache component for Yii, so I wrote one of those too.
We're still not quite out of the woods, unfortunately. The S3 component defaults to a file-level cache dependency, which means that any time a file changes, the cache is invalidated. This is bad news because, thanks to the ephemeral filesystem, the file is "changed" every time we push code (or anytime Heroku switches the dyno, which could be ANYtime), invalidating the entire cache along the way, meaning every asset has to be republished to S3. Instead of relying on a file cache dependency, I've modified the S3 extension so that it never invalidates the cache at all and instead uses a simple version number variable. This way, the staging server can be a version ahead of the production server, and once all the testing is done, I just have to change the variable on the production server and it points to the right collection in IronCache, which points to the right assets on S3.
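To make the version-number idea concrete, here is a rough sketch under my own assumptions. It is written in Python rather than PHP, a dict stands in for the shared IronCache store, and the class name, key format, and URLs are hypothetical, not taken from the modified extension.

```python
from typing import Dict, Optional


class VersionedAssetCache:
    """Namespaces every cache entry by an asset version instead of invalidating on file changes."""

    def __init__(self, store: Dict[str, str], version: str):
        self.store = store      # shared between staging and production
        self.version = version  # the per-environment version variable

    def _key(self, path: str) -> str:
        # Hypothetical key format: one "collection" of entries per version.
        return "assets:" + self.version + ":" + path

    def get(self, path: str) -> Optional[str]:
        return self.store.get(self._key(path))

    def set(self, path: str, url: str) -> None:
        self.store[self._key(path)] = url


shared_store: Dict[str, str] = {}
staging = VersionedAssetCache(shared_store, "v2")
production = VersionedAssetCache(shared_store, "v1")

# Staging publishes the new assets and records them under its own version.
staging.set("js/app.js", "https://bucket.example/v2/js/app.js")

before = production.get("js/app.js")  # production is still on v1, so it sees nothing new
production.version = "v2"             # the "deploy": bump the version variable
after = production.get("js/app.js")   # now resolves to what staging already published
```

Because nothing is ever invalidated, a dyno restart costs nothing: the entries for the current version are still in the shared store, and production only starts seeing new assets when its version variable changes.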
What Have You Run Into?
I had to jump through a lot of hoops to get everything squared away and running on Heroku, but I've finally got it set up and ready to go and am very happy with the result. I'd love to hear in the comments if you've run into any similar problems on Heroku or with Yii in general.