2005-07-06 13:24:10 +00:00
=========================
CML (Cache Meta Language)
=========================
---------------
Module: mod_cml
---------------
:Author: Jan Kneschke
:Date: $Date: 2004/11/03 22:26:05 $
:Revision: $Revision: 1.2 $
:abstract:
2005-07-15 17:46:13 +00:00
CML is a Meta language to describe the dependencies of a page at one side and building a page from its fragments on the other side using LUA.
2006-10-04 13:26:23 +00:00
2005-07-06 13:24:10 +00:00
.. meta::
2005-07-15 17:46:13 +00:00
:keywords: lighttpd, cml, lua
2006-10-04 13:26:23 +00:00
2005-07-06 13:24:10 +00:00
.. contents:: Table of Contents
Description
===========
CML (Cache Meta Language) wants to solves several problems:
* dynamic content needs caching to perform
* checking if the content is dirty inside of the application is usually more expensive than sending out the cached data
* a dynamic page is usually fragmented and the fragments have different livetimes
* the different fragements can be cached independently
Cache Decision
--------------
2006-10-04 13:26:23 +00:00
A simple example should show how to a content caching the very simple way in PHP.
2005-07-06 13:24:10 +00:00
jan.kneschke.de has a very simple design:
* the layout is taken from a template in templates/jk.tmpl
* the menu is generated from a menu.csv file
* the content is coming from files on the local directory named content-1, content-2 and so on
2006-10-04 13:26:23 +00:00
The page content is static as long non of the those tree items changes. A change in the layout
2005-07-06 13:24:10 +00:00
is affecting all pages, a change of menu.csv too, a change of content-x file only affects the
cached page itself.
If we model this in PHP we get: ::
<?php
## ... fetch all content-* files into $content
$cachefile = "/cache/dir/to/cached-content";
function is_cachable($content, $cachefile) {
if (!file_exists($cachefile)) {
return 0;
} else {
$cachemtime = filemtime($cachefile);
}
foreach($content as $k => $v) {
2006-10-04 13:26:23 +00:00
if (isset($v["file"]) &&
2005-07-06 13:24:10 +00:00
filemtime($v["file"]) > $cachemtime) {
return 0;
}
}
if (filemtime("/menu/menu.csv") > $cachemtime) {
return 0;
}
if (filemtime("/templates/jk.tmpl") > $cachemtime) {
return 0;
}
}
2006-10-04 13:26:23 +00:00
2005-07-06 13:24:10 +00:00
if (is_cachable(...), $cachefile) {
readfile($cachefile);
exit();
} else {
# generate content and write it to $cachefile
}
?>
Quite simple. No magic involved. If the one of the files is new than the cached
2006-10-04 13:26:23 +00:00
content, the content is dirty and has to be regenerated.
2005-07-06 13:24:10 +00:00
Now let take a look at the numbers:
* 150 req/s for a Cache-Hit
* 100 req/s for a Cache-Miss
As you can see the increase is not as good as it could be. The main reason as the overhead
of the PHP interpreter to start up (a byte-code cache has been used here).
Moving these decisions out of the PHP script into a server module will remove the need
2006-10-04 13:26:23 +00:00
to start PHP for a cache-hit.
2005-07-06 13:24:10 +00:00
2006-10-04 13:26:23 +00:00
To transform this example into a CML you need 'index.cml' in the list of indexfiles
2005-07-06 13:24:10 +00:00
and the following index.cml file: ::
2006-03-02 14:11:25 +00:00
output_contenttype = "text/html"
2005-07-06 13:24:10 +00:00
2005-07-15 17:46:13 +00:00
b = request["DOCUMENT_ROOT"]
cwd = request["CWD"]
2005-07-06 13:24:10 +00:00
2005-07-17 10:07:53 +00:00
output_include = { b .. "_cache.html" }
2005-07-15 17:46:13 +00:00
trigger_handler = "index.php"
2005-07-17 10:07:53 +00:00
if file_mtime(b .. "../lib/php/menu.csv") > file_mtime(cwd .. "_cache.html") or
2005-09-07 07:33:15 +00:00
file_mtime(b .. "templates/jk.tmpl") > file_mtime(cwd .. "_cache.html") or
file_mtime(b .. "content.html") > file_mtime(cwd .. "_cache.html") then
2005-07-31 13:18:18 +00:00
return CACHE_MISS
2006-10-04 13:26:23 +00:00
else
2005-07-31 13:18:18 +00:00
return CACHE_HIT
2005-07-15 17:46:13 +00:00
end
2005-07-06 13:24:10 +00:00
Numbers again:
* 4900 req/s for Cache-Hit
* 100 req/s for Cache-Miss
Content Assembling
------------------
Sometimes the different fragment are already generated externally. You have to cat them together: ::
<?php
readfile("head.html");
readfile("menu.html");
readfile("spacer.html");
readfile("db-content.html");
readfile("spacer2.html");
readfile("news.html");
readfile("footer.html");
2006-10-04 13:26:23 +00:00
?>
2005-07-06 13:24:10 +00:00
2006-10-04 13:26:23 +00:00
We we can do the same several times faster directly in the webserver.
2005-07-06 13:24:10 +00:00
Don't forget: Webserver are built to send out static content, that is what they can do best.
The index.cml for this looks like: ::
2006-03-02 14:11:25 +00:00
output_contenttype = "text/html"
2006-10-04 13:26:23 +00:00
2005-07-15 17:46:13 +00:00
cwd = request["CWD"]
2006-10-04 13:26:23 +00:00
output_include = { cwd .. "head.html",
2005-07-17 10:07:53 +00:00
cwd .. "menu.html",
cwd .. "spacer.html",
cwd .. "db-content.html",
cwd .. "spacer2.html",
cwd .. "news.html",
cwd .. "footer.html" }
2006-10-04 13:26:23 +00:00
2005-07-31 13:18:18 +00:00
return CACHE_HIT
2005-07-06 13:24:10 +00:00
Now we get about 10000 req/s instead of 600 req/s.
2006-01-03 12:55:09 +00:00
Power Magnet
------------
Next to all the features about Cache Decisions CML can do more. Starting
with lighttpd 1.4.9 a power-magnet was added which attracts each request
2006-10-04 13:26:23 +00:00
and allows you to manipulate the request for your needs.
2006-01-03 12:55:09 +00:00
We want to display a maintainance page by putting a file in a specified
place:
We enable the power magnet: ::
cml.power-magnet = "/home/www/power-magnet.cml"
and create /home/www/power-magnet.cml with: ::
dr = request["DOCUMENT_ROOT"]
if file_isreg(dr .. 'maintainance.html') then
output_include = { 'maintainance.html' }
return CACHE_HIT
end
return CACHE_MISS
For each requested file the /home/www/power-magnet.cml is executed which
checks if maintainance.html exists in the docroot and displays it
2006-10-04 13:26:23 +00:00
instead of handling the usual request.
2006-01-03 12:55:09 +00:00
Another example, create thumbnail for requested image and serve it instead
of sending the big image: ::
## image-url is /album/baltic_winter_2005.jpg
## no params -> 640x480 is served
## /album/baltic_winter_2005.jpg/orig for full size
## /album/baltic_winter_2005.jpg/thumb for thumbnail
dr = request["DOCUMENT_ROOT"]
sn = request["SCRIPT_NAME"]
2006-10-04 13:26:23 +00:00
## to be continued :) ...
2006-01-03 12:55:09 +00:00
trigger_handler = '/gen_image.php'
return CACHE_MISS
2005-07-15 18:19:10 +00:00
Installation
============
2005-07-17 10:07:53 +00:00
You need `lua <http://www.lua.org/>`_ and should install `libmemcache-1.3.x <http://people.freebsd.org/~seanc/libmemcache/>`_ and have to configure lighttpd with: ::
2005-07-15 18:19:10 +00:00
./configure ... --with-lua --with-memcache
To use the plugin you have to load it: ::
server.modules = ( ..., "mod_cml", ... )
2005-07-06 13:24:10 +00:00
Options
=======
2006-10-04 13:26:23 +00:00
:cml.extension:
2005-07-06 13:24:10 +00:00
the file extension that is bound to the cml-module
2005-07-15 17:46:13 +00:00
:cml.memcache-hosts:
hosts for the memcache.* functions
:cml.memcache-namespace:
(not used yet)
2006-01-03 12:55:09 +00:00
:cml.power-magnet:
a cml file that is executed for each request
2005-07-06 13:24:10 +00:00
Language
========
2006-10-04 13:26:23 +00:00
The language used for CML is provided by `LUA <http://www.lua.org/>`_.
2005-07-15 17:46:13 +00:00
Additionally to the functions provided by lua mod_cml provides: ::
tables:
request
- REQUEST_URI
- SCRIPT_NAME
- SCRIPT_FILENAME
- DOCUMENT_ROOT
- PATH_INFO
- CWD
- BASEURI
get
- parameters from the query-string
functions:
string md5(string)
number file_mtime(string)
string memcache_get_string(string)
number memcache_get_long(string)
2006-10-04 13:26:23 +00:00
boolean memcache_exists(string)
2005-07-15 17:46:13 +00:00
2006-10-04 13:26:23 +00:00
What ever your script does, it has to return either CACHE_HIT or CACHE_MISS.
It case a error occures check the error-log, the user will get a error 500. If you don't like
2005-07-15 17:46:13 +00:00
the standard error-page use ``server.errorfile-prefix``.