25 Jan 2011
The Linux, Apache, MySQL, and PHP (LAMP) architecture is one of the most popular
choices for web server architectures in use today. Author John Mertic examines five
things every LAMP application should take advantage of for optimum performance.
Introduction
Major web properties like Wikipedia, Facebook, and Yahoo! use the LAMP
architecture to serve millions of requests a day, while web application software like
WordPress, Joomla, Drupal, and SugarCRM use this architecture to enable
organizations to deploy web-based applications easily.
The strength of the architecture lies in its simplicity. While stacks like .NET and
Java™ technology may use massive hardware, expensive software stacks, and
complex performance tuning, the LAMP stack can run on commodity hardware,
using open source software stacks. Because the software stack is a loose set of
components rather than a monolithic stack, tuning for performance can be a
challenge since each component needs to be analyzed and tuned.
However, several simple performance tasks can have a huge impact
on the performance of websites of any size. In this article, we will look at five such
tasks designed to optimize LAMP application performance. These items should
require few, if any, architectural changes to your application, making them safe
and easy options for maximizing the responsiveness and minimizing the hardware
requirements of your web application.
The easiest way to boost the performance of any PHP application (the "P" in LAMP, of
course) is to take advantage of an opcode cache. For any website I work with, it's
the one thing I make sure is present, since the performance impact is huge (response
times are often half of what they are without an opcode cache). But the
big question most people new to PHP have is why the improvement is so drastic.
The answer lies in how PHP handles web requests. Figure 1 outlines the flow of a
PHP request.
Since PHP is an interpreted language rather than a compiled one like C or the Java
language, the entire parse-compile-execute sequence is carried out for every request.
You can see how this can be time- and resource-consuming, especially when scripts
rarely change between requests. After the script is parsed and compiled, it is
in a machine-parseable state as a series of opcodes. This is where an opcode cache
comes into effect. It caches these compiled scripts as a series of opcodes to avoid
the parse and compile steps for every request. You can see how such a workflow
would work in Figure 2.
So when the cached opcodes of a PHP script exist, we can skip the parse and
compile steps of the PHP request process and directly execute the cached opcodes
and output the results. The checking algorithm takes care of situations where you
may have made a change to the script file: on the first request for the changed
script, the opcodes are automatically recompiled and cached for subsequent
requests, replacing the previously cached script.
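As a rough illustration of the workflow described above, a minimal php.ini fragment for enabling APC might look like the following. The values are assumptions chosen for the sketch, not recommendations from this article; the extension file name differs on Windows, and older APC releases expect apc.shm_size as a plain number of megabytes.

```ini
; Load and enable the APC extension (file name assumed for a UNIX build)
extension = apc.so
apc.enabled = 1

; Shared memory reserved for cached opcodes (assumed size)
apc.shm_size = 64M

; 1 = check each script's modification time on every request, so changed
; files are recompiled and re-cached automatically
apc.stat = 1
```

Leaving apc.stat on corresponds to the checking step described above; turning it off skips that check, which means a changed script is not picked up until the cache is cleared or the web server is restarted.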
Opcode caches have long been popular for PHP, with some of the first ones coming
about back in the heyday of PHP V4. Today there are a few popular choices that are
in active development and being used:
• Alternative PHP Cache (APC) is probably the most popular opcode cache
for PHP (see Resources). It is developed by several core PHP developers
and has gained much of its speed and stability from major contributions by
engineers at Facebook and Yahoo! It also sports several other speed
improvements for handling PHP requests, including a user cache
component we'll look at later in this article.
• Wincache is an opcode cache that is most actively developed by the
Internet Information Services (IIS) team at Microsoft® for use only on
Windows® using the IIS web server (see Resources). It was developed
predominantly in an effort to make PHP a first-class development platform
on the Windows-IIS-PHP stack, as APC was known not to work well on
that stack. It is very similar to APC in function and sports a user cache
component, as well as a built-in session handler that stores session data
directly in Wincache.
• eAccelerator is a fork of one of the original PHP caches, the Turck
MMCache opcode cache (see Resources). Unlike APC and Wincache, it
is only an opcode cache and optimizer, so it does not contain the user
cache components. It is fully compatible across UNIX® and Windows
stacks, and it is quite popular for sites that don't intend to leverage the
additional features APC or Wincache provide. This is often the case if you
will be using a solution like memcache to have a separate user cache
server for a multi-web server environment.
Without a doubt, an opcode cache is the first step in speeding up PHP by removing
the need to parse and compile a script on every request. Once this first step is
completed, you should see an improvement in response time and server load. But
there is more you can do to optimize PHP, which we'll look at next.
Let's take a look at a few items that are important to help performance.
There are several php.ini settings that should be disabled, since they are often used
for backward-compatibility:
There are some good performance options you can enable in the php.ini file to give
your scripts a bit of a speed boost:
So what's the best way to deal with this? There are a few things you can do to speed
this up.
• Use absolute paths for all require() and include() calls. This
tells PHP exactly which file you wish to include, so it does not
need to check every entry in the include_path for your file.
• Keep the number of entries in the include_path low. This helps in
situations where it's difficult to provide an absolute path for every
require() and include() call (often the case in large, legacy
applications), because PHP won't check locations where the file you are
including can't be.
APC and Wincache also have mechanisms for caching the results of file status
checks made by PHP, so repeated file-system checks are not needed. They are
most effective when you keep your include file names static rather than
variable-driven, so it's important to try to do this whenever possible.
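The two pointers above can be sketched in a few lines of PHP. This is only an illustration: appPath() is a hypothetical helper, and lib/helpers.php is a placeholder file name that does not come from the article.

```php
<?php
// Build an absolute path from a path relative to this file's directory.
// appPath() is a hypothetical helper, not part of PHP.
function appPath($relative)
{
    return dirname(__FILE__) . DIRECTORY_SEPARATOR . ltrim($relative, '/\\');
}

// Keep include_path short: one library directory plus the current directory.
set_include_path(appPath('lib') . PATH_SEPARATOR . '.');

// 'lib/helpers.php' is a placeholder. The is_file() guard exists only so the
// sketch runs where that file is absent; real code would require it directly.
$helpers = appPath('lib/helpers.php');
if (is_file($helpers)) {
    require_once $helpers;
}
```

Because the path handed to require_once is already absolute, PHP does not have to walk the include_path to find the file, and the file name stays static rather than variable-driven.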
Database queries can become quite intense on their own, often pegging a CPU at
100 percent while running simple SELECT statements against reasonably sized datasets. If
both your web server and database server are competing for CPU time on a single
machine, this will definitely slow down your requests. Thus I consider it a good first
step to have the web server and database server on separate machines and be sure
you make your database server the beefier of the two (database servers love lots of
memory and multiple CPUs).
Probably the biggest issues with database performance come as a result of poor
database design and missing indexes. SELECT statements are usually by far
the most common type of query run in a typical web application.
They are also the most time-consuming queries run on a database server.
Additionally, these kinds of SQL statements are the most sensitive to proper
indexing and database design, so the following pointers should help you get optimal
performance.
• Make sure each table has a primary key. This provides the table a default
order and a fast way to join the table against other tables.
• Make sure any foreign keys in a table (that is, keys that link a record to a
record in another table) are properly indexed. Many databases
enforce constraints on these keys automatically so that each value actually
matches a record in the other table, which can help here.
• Try to limit the number of columns in a table. Too many columns in a table
can make the scan time for queries much longer than if there are just a
few columns. In addition, if you have a table with many columns that
aren't typically used, you are also wasting disk space with NULL value
fields. This is also true with variable size fields, such as text or blob,
where the table size can grow much larger than needed. In this case, you
should consider splitting off the additional columns into a different table,
joining them together on the primary key of the records.
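A minimal sketch of these three pointers follows. To keep it runnable without a database server, it uses an in-memory SQLite database through PDO as a stand-in for MySQL (an assumption of this sketch, which also requires the pdo_sqlite driver). The accounts and leads table names echo the query examined later in this article; the columns and the leads_notes table are hypothetical.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL so the sketch needs no server.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Every table gets a primary key.
$db->exec('CREATE TABLE accounts (
    id INTEGER PRIMARY KEY,
    name VARCHAR(100) NOT NULL
)');
$db->exec('CREATE TABLE leads (
    id INTEGER PRIMARY KEY,
    account_id INTEGER NOT NULL,
    last_name VARCHAR(100) NOT NULL,
    FOREIGN KEY (account_id) REFERENCES accounts (id)
)');

// Index the foreign key that joins leads to accounts.
$db->exec('CREATE INDEX idx_leads_account_id ON leads (account_id)');

// Rarely read, variable-size data moves to a side table that shares the
// primary key of the main record (hypothetical table).
$db->exec('CREATE TABLE leads_notes (
    lead_id INTEGER PRIMARY KEY,
    notes TEXT,
    FOREIGN KEY (lead_id) REFERENCES leads (id)
)');

$db->exec("INSERT INTO accounts (id, name) VALUES (1, 'Acme')");
$db->exec("INSERT INTO leads (id, account_id, last_name) VALUES (10, 1, 'Smith')");
$db->exec("INSERT INTO leads_notes (lead_id, notes) VALUES (10, 'Met at trade show')");

// The common query touches only the narrow tables.
$sql = 'SELECT leads.last_name AS lead_name, accounts.name AS account_name
        FROM leads INNER JOIN accounts ON accounts.id = leads.account_id';
$rows = $db->query($sql)->fetchAll(PDO::FETCH_ASSOC);
```

The notes column is joined in only when a page actually needs it, so the frequently scanned leads table stays small.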
Analyze the queries being run on the server
The best tool for improving database performance is analyzing what queries are
being run on your database server and how long they are taking to run. Just about
every database out there has tools for doing this. With MySQL, you can take
advantage of the slow query log to find the problematic queries. To use it, set the
slow_query_log setting to 1 in the MySQL configuration file, and set log_output to
FILE to have the queries logged to the file hostname-slow.log. The
long_query_time threshold sets how long, in seconds, a query must run
to be considered a "slow query." I'd recommend setting this to 5 seconds at first and
moving it down to 1 second over time, depending upon your data set. If you look at
this file, you'll see the queries detailed similar to Listing 1.
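Put together, the settings described above would look roughly like this in the MySQL configuration file. The [mysqld] section placement is an assumption about a typical my.cnf layout; the values are the ones suggested in the text.

```ini
[mysqld]
# Turn on the slow query log
slow_query_log  = 1
# Write entries to a file rather than a table
log_output      = FILE
# Queries running longer than this many seconds are logged
long_query_time = 5
```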
The key thing we want to look at is Query_time, which shows how long the query
took. Another thing to look at is the number of Rows_sent and Rows_examined,
since these can point to situations where a query might be written incorrectly if it's
looking at too many rows or returning too many rows. You can delve deeper into
how a query is written by prepending EXPLAIN to the query, which will return the
query plan instead of the result set, as shown in Listing 2.
mysql> explain select * from accounts inner join leads on accounts.id = leads.account_id;
+----+-------------+----------+--------+--------------------------+---------+---------+---------------------------+------+-------+
| id | select_type | table    | type   | possible_keys            | key     | key_len | ref                       | rows | Extra |
+----+-------------+----------+--------+--------------------------+---------+---------+---------------------------+------+-------+
|  1 | SIMPLE      | leads    | ALL    | idx_leads_acct_del       | NULL    | NULL    | NULL                      |  200 |       |
|  1 | SIMPLE      | accounts | eq_ref | PRIMARY,idx_accnt_id_del | PRIMARY | 108     | sugarcrm.leads.account_id |    1 |       |
+----+-------------+----------+--------+--------------------------+---------+---------+---------------------------+------+-------+
2 rows in set (0.00 sec)
The MySQL manual dives much deeper into the topic of the EXPLAIN output (see
Resources), but the big thing I look at is places where the 'type' column is 'ALL',
since this requires MySQL to do a full table scan and doesn't use a key for a lookup.
Such rows point you to places where adding indexes can significantly help query
speed.
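The same check can be automated. The sketch below scans the rows of an EXPLAIN result for full table scans; findFullScans() is a hypothetical helper, and the sample plan is copied from the EXPLAIN output above, reduced to the columns the function reads.

```php
<?php
// Return the names of the tables that an EXPLAIN result reports as full scans.
// findFullScans() is a hypothetical helper, not a MySQL or PDO function.
function findFullScans(array $explainRows)
{
    $tables = array();
    foreach ($explainRows as $row) {
        if (strtoupper($row['type']) === 'ALL') {
            $tables[] = $row['table'];
        }
    }
    return $tables;
}

// Rows taken from the EXPLAIN output above.
$plan = array(
    array('table' => 'leads',    'type' => 'ALL',    'key' => null,      'rows' => 200),
    array('table' => 'accounts', 'type' => 'eq_ref', 'key' => 'PRIMARY', 'rows' => 1),
);

$fullScans = findFullScans($plan);
```

Against a live server, the rows could come from running the query with EXPLAIN prepended through PDO and fetching the result as associative arrays.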
Two of the opcode caches we looked at earlier, APC and Wincache, also provide a
user cache for data that doesn't change very often: you can store PHP data directly
in a shared memory segment for quick retrieval instead of querying the database on
every request. Listing 3 provides an example of how to do this.
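<?php
function getListOfUsers()
{
    // apc_fetch() sets $hit to false on a cache miss.
    $list = apc_fetch('getListOfUsers', $hit);
    if (!$hit) {
        // Cache miss: run the query and collect the rows.
        $list = array();
        $conn = new PDO('mysql:dbname=testdb;host=127.0.0.1', 'dbuser', 'dbpass');
        $sql = 'SELECT id, name FROM users ORDER BY name';
        foreach ($conn->query($sql) as $row) {
            $list[] = $row;
        }
        // Store the result so later requests skip the query.
        apc_store('getListOfUsers', $list);
    }
    return $list;
}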
We'll only need to do the query one time. Afterward, we push the result into the APC
user cache under the key getListOfUsers. From here on out, until the cache
expires, you will be able to fetch the result array directly out of cache, skipping over
the SQL query.
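The pattern in Listing 3 can be generalized into a small read-through helper with an explicit expiry time. This is a sketch under stated assumptions: cacheGet() is a hypothetical function, the closure syntax needs PHP 5.3 or later, and the per-request array fallback is there only so the code also runs where APC is not loaded or not enabled.

```php
<?php
// Return the cached value for $key, or build it with $producer and cache it
// for $ttl seconds. cacheGet() is a hypothetical helper.
function cacheGet($key, $ttl, $producer)
{
    static $local = array();   // per-request fallback when APC is unavailable
    $hasApc = function_exists('apc_fetch');

    if ($hasApc) {
        $value = apc_fetch($key, $hit);
        if ($hit) {
            return $value;
        }
    }
    if (array_key_exists($key, $local)) {
        return $local[$key];
    }

    // Cache miss: do the expensive work once.
    $value = call_user_func($producer);

    // The third argument of apc_store() is the time to live in seconds.
    if (!$hasApc || !apc_store($key, $value, $ttl)) {
        $local[$key] = $value;
    }
    return $value;
}

// Usage: the closure stands in for the database query in Listing 3.
$users = cacheGet('demo_user_list', 300, function () {
    return array(array('id' => 1, 'name' => 'alice'));
});
```

Choosing the time to live is a trade-off: a longer value saves more queries, while a shorter one limits how long stale data can be served.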
APC and Wincache aren't the only choices for a user cache; memcache and Redis
are other popular choices that don't require you to run the user cache on the same
server as the Web server. This gives added performance and flexibility, especially if
your web application is scaled out across several Web servers.
Conclusion
In this article, we looked at five simple ways to tune your LAMP application for better
performance. We looked at techniques not only at the PHP level, by leveraging an
opcode cache and optimizing the PHP configuration, but also looked at optimizing
your database design for proper indexing. We also took a look at leveraging a user
cache (using APC as an example) to show how you can avoid repeated database
calls when the data doesn't change very often.
Downloads
Description   Name                   Size   Download method
Source code   os-5waystunelamp.zip          HTTP
Resources
Learn
• "A PHP V5 migration guide": Learn how to migrate code developed in PHP V4
to V5.
• Planet PHP is the PHP developer community news source.
• The MySQL manual dives much deeper into the topic of the EXPLAIN output.
• PHP.net is the central resource for PHP developers.
• Check out the "Recommended PHP reading list."
• Browse all the PHP content on developerWorks.
• Follow developerWorks on Twitter.
• Expand your PHP skills by checking out IBM developerWorks' PHP project
resources.
• To listen to interesting interviews and discussions for software developers,
check out developerWorks podcasts.
• Using a database with PHP? Check out the Zend Core for IBM, a seamless,
out-of-the-box, easy-to-install PHP development and production environment
that supports IBM DB2 V9.
• Stay current with developerWorks' Technical events and webcasts.
• Check out upcoming conferences, trade shows, webcasts, and other Events
around the world that are of interest to IBM open source developers.
• Visit the developerWorks Open source zone for extensive how-to information,
tools, and project updates to help you develop with open source technologies
and use them with IBM's products, as well as our most popular articles and
tutorials.
• Watch and learn about IBM and open source technologies and product
functions with the no-cost developerWorks On demand demos.
Get products and technologies
• Alternative PHP Cache is probably the most popular opcode cache for PHP.
• Wincache is an opcode cache that is most actively developed by the IIS team at
Microsoft for use only on Windows using the IIS (Internet Information Services)
Web server.
• eAccelerator is a fork of one of the original PHP caches, the Turck MMCache
opcode cache.
• Innovate your next open source development project with IBM trial software,
available for download or on DVD.
• Download IBM product evaluation versions or explore the online trials in the
IBM SOA Sandbox and get your hands on application development tools and
middleware products from DB2®, Lotus®, Rational®, Tivoli®, and
WebSphere®.
Discuss
• Get involved in the developerWorks community. Connect with other
developerWorks users while exploring the developer-driven blogs, forums,
groups, and wikis.
• Participate in developerWorks blogs and get involved in the developerWorks
community.
• Participate in the developerWorks PHP Forum: Developing PHP applications
with IBM Information Management products (DB2, IDS).