A few weeks ago, I was brought in to help raise the performance of a site running Tiki that was under high load.
I saw Rasmus Lerdorf give a very interesting talk on PHP optimization called "Getting Rich with PHP5." In it, he recommends the use of valgrind as a way of holistically profiling a web server running an application. That's a nice way of seeing what's holding you down when you have that luxury, but what can you do when the machine is under heavy load?
The first thing I did was make sure that the machine had an opcode cache installed. It did: eAccelerator. Avoiding compilation on each request is a good thing. What next?
I decided to get an idea of what the Apache threads were doing by using strace(1) to watch the syscalls on a live thread. Profiling this way isn't guaranteed to make the low-hanging fruit obvious to you, but if you're comfortable assuming that disk IO is your bottleneck, it can certainly give you ideas.
One thing I did right off the bat was look at what files were being opened by the thread. If they were PHP files, odds were they were about to be read and parsed by the PHP engine - which isn't a good thing. Ideally, this would only have to happen once. In the real world of conditional and dynamic includes, however, opcode caches don't always deliver.
open("/var/local/tiki/lib/smarty/libs/internals/core.load_plugins.php", O_RDONLY) = 7
open("/var/local/tiki/lib/smarty/libs/internals/core.assemble_plugin_filepath.php", O_RDONLY) = 7
open("/var/local/tiki/lib/smarty/libs/plugins/function.html_image.php", O_RDONLY) = 7
open("/var/local/tiki/lib/smarty/libs/internals/core.assemble_plugin_filepath.php", O_RDONLY) = 7
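Eyeballing a trace like the one above gets tedious quickly, so here is a minimal sketch of how the tally could be automated. It assumes the trace lines look like the excerpt above (for example, output saved from `strace -p <pid> -e trace=open -o <file>`); the function name `count_php_opens` is made up for this illustration.

```php
<?php
// Tally which .php files a traced process opened, most frequent first.
// Expects lines in the strace format shown above: open("path", FLAGS) = fd
function count_php_opens(array $straceLines): array
{
    $counts = [];
    foreach ($straceLines as $line) {
        // Match successful open()/openat() calls on .php files.
        // Failed calls return "-1 ERRNO" and are skipped by the \d+ at the end.
        $pattern = '/\bopen(?:at)?\((?:[^"]*,\s*)?"([^"]+\.php)",[^)]*\)\s*=\s*(\d+)/';
        if (preg_match($pattern, $line, $m)) {
            $counts[$m[1]] = ($counts[$m[1]] ?? 0) + 1;
        }
    }
    arsort($counts); // highest count first, keys (paths) preserved
    return $counts;
}
```

Files that show up again and again on a single request are the ones the opcode cache is apparently not saving you from.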
Between this and other tests I've done of template engines in PHP, it seems that most template engines just don't play all that well with opcode caches. Which is unfortunate. It'd be nice if the Smarty team addressed that in the future, but given that the Smarty project's goal is to reinvent the wheel, I'm not holding my breath.
Slightly discouraged and with the machine becoming increasingly unresponsive, I was informed that the sysadmins had set up proxies running mod_proxy to handle the front-end traffic. However, they noticed that their caches weren't filling up. We addressed this for media with the Apache mod_expires module, and for PHP scripts we manually set Expires and Cache-Control headers using header(). There's an excellent tutorial available if that sounds like Greek to you, and it largely guided what we did. The caches still weren't filling up - what gives?
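For the PHP side, the header lines might be built along these lines. This is only a sketch: the helper name `cache_headers`, the `public` directive, and the five-minute lifetime in the usage comment are assumptions for illustration, not what was actually deployed.

```php
<?php
// Build Expires and Cache-Control header lines for a response that shared
// caches may keep for $ttl seconds. $now can be injected for testing.
function cache_headers(int $ttl, ?int $now = null): array
{
    $now = $now ?? time();
    return [
        // HTTP dates are always expressed in GMT.
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $ttl) . ' GMT',
        'Cache-Control: public, max-age=' . $ttl,
    ];
}

// Usage: send the headers before any output is written.
// foreach (cache_headers(300) as $h) { header($h); }
```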
I browsed to the site directly and looked at the HTTP headers using curl. I noticed it was setting a cookie with a PHP session ID - of course! session_start() was being called in one of Tiki's includes, and session_start() sets a cookie. Responses that set cookies aren't going to be cached by most real-world caches. So I made the session_start() call conditional, and arranged for it to run only on pages I knew needed access to the session data.
The machine was serving fewer requests now, but the IO was still hurting hard. I reached for my strace output, and saw a bunch of these:
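One way the conditional could be structured is sketched below. The path list is a placeholder, and the rule that a visitor who already carries a session cookie keeps their session is my own assumption rather than something taken from the Tiki codebase.

```php
<?php
// Decide whether a request needs a PHP session. $sessionPaths holds path
// prefixes of pages that really read session data (placeholders here).
function needs_session(
    string $path,
    array $cookies,
    array $sessionPaths,
    string $cookieName = 'PHPSESSID' // PHP's default session cookie name
): bool {
    // Assumption: a visitor who already has a session cookie keeps the session.
    if (isset($cookies[$cookieName])) {
        return true;
    }
    foreach ($sessionPaths as $prefix) {
        if (strncmp($path, $prefix, strlen($prefix)) === 0) {
            return true;
        }
    }
    return false;
}

// Usage (the path is a placeholder):
// if (needs_session($_SERVER['SCRIPT_NAME'], $_COOKIE, ['/login.php'])) {
//     session_start();
// }
```

Anonymous page views then go out without a Set-Cookie header, which is what gives the proxies a chance to cache them.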
open("/var/local/tiki/lang/es/language.php", O_RDONLY) = 7
Gross! Worse still, it was crawling through the include path several times before finding the file, causing several expensive filesystem syscalls per request, all to load and compile this roughly 300KB file.
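To see why a relative include gets expensive, it helps to list the locations PHP may probe. The sketch below is a simplification (the real resolver has additional fallbacks), and the directories in the usage comment are hypothetical; the point is only that each entry ahead of the right one costs at least one failed filesystem lookup.

```php
<?php
// List, in order, the locations a relative include such as
// "lang/es/language.php" may be looked up in, given an include_path string.
// Simplified: PHP's real resolver has further fallbacks beyond include_path.
function include_candidates(string $relative, string $includePath): array
{
    $candidates = [];
    foreach (explode(PATH_SEPARATOR, $includePath) as $dir) {
        $candidates[] = rtrim($dir, '/') . '/' . $relative;
    }
    return $candidates;
}

// Usage (hypothetical directories):
// print_r(include_candidates('lang/es/language.php', get_include_path()));
```

Including by absolute path skips this search entirely, which is one cheap way to cut the stray syscalls.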
I went and found the code that was causing this:

function tra($content, $lg='') {
    global $lang_use_db;
    global $language;
    if ($lang_use_db != 'y') {
        if ($lg == "" || $lg == $language) {
            global $lang;
            include_once("lang/$language/language.php");
        }
        else
            include ("lang/$lg/language.php");

Opcode caches are known to not play well with conditional includes such as include_once or an include inside of a branch, and this function had both. The kicker? All the language.php files were doing was setting a $lang array. It was a perfect job for memcached.

function tra($content, $lg='') {
    global $lang_use_db;
    global $language;
    global $memcache_obj;
    global $lang;
    if(empty($lg)) $lg = $language;
    if ($lang_use_db != 'y') {
        if(!$lang) {
            $lang = memcache_get($memcache_obj, $lg);
        }
        if(!$lang) {
            include ("lang/$lg/language.php");
            memcache_set($memcache_obj, $lg, $lang, 0, 600);
        }

And at the top of the file somewhere, I created a $memcache_obj object with memcache_pconnect(). Instantly, the load on the machine dropped, and the problem then became MySQL threads, and not Apache threads.
Tikiwiki (mis)uses a tiki_searchindex table, and while I now have some idea of what this does, exactly, there were aspects of what I saw and read in the code that are still bizarre to me. The important things to note, however, are that the database was almost constantly doing inserts and deletes on this table, and that the content of the table was an index for searching. I pruned the table a bit and recreated it with the MEMORY engine, and the performance increased markedly.
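The pattern at work here is cache-aside: check the cache, fall back to the expensive load on a miss, then store the result. The sketch below shows it in isolation, under loud assumptions: a plain array stands in for memcached so it runs without the extension, a callable stands in for the include of language.php, and the `lang:` key prefix and the name `load_lang` are invented for the example.

```php
<?php
// Cache-aside lookup for a translation table.
// $cache:  plain array standing in for memcached (assumption for this sketch).
// $loader: stands in for include("lang/$lg/language.php") and returns $lang.
function load_lang(string $lg, array &$cache, callable $loader): array
{
    $key = "lang:$lg"; // hypothetical key prefix, one entry per language
    if (isset($cache[$key])) {
        return $cache[$key]; // hit: no file open, no compile
    }
    $lang = $loader($lg);    // miss: pay for the expensive load once
    $cache[$key] = $lang;
    return $lang;
}
```

Keying each entry by language keeps one language's table from being served for another. With the real extension, the array reads and writes would be replaced by the memcache_get() and memcache_set() calls shown in the function above.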
That was enough to keep the site online. I was able to go into this application - without any prior exposure to Tiki - and, through simple analysis of what the machine was actually doing, make surgical changes to the codebase to get the extra oomph that the system needed. In the end, most of what I did wouldn't have been possible were it not for strace(1). I should point out for the BSD users in the crowd that ktrace(1) works very well for me and is less buggy than the procfs-based strace port, at least on FreeBSD.