This commit is a large set of code changes and results in removal of
hundreds, perhaps thousands, of CPU instructions, a portion of which
are on hot code paths.
Most (buffer *) used by lighttpd are not NULL, especially since buffers
were inlined into numerous larger structs such as request_st and chunk.
In the small number of instances where that is not the case, a NULL
check is often performed earlier in a function where that buffer is
later used with a buffer_* func. In the handful of cases that remained,
a NULL check was added, e.g. with r->http_host and r->conf.server_tag.
- check for empty strings at config time and set value to NULL if blank
string will be ignored at runtime; at runtime, simple pointer check
for NULL can be used to check for a value that has been set and is not
blank ("")
- use buffer_is_blank() instead of buffer_string_is_empty(),
and use buffer_is_unset() instead of buffer_is_empty(),
where buffer is known not to be NULL so that NULL check can be skipped
- use buffer_clen() instead of buffer_string_length() when buffer is
known not to be NULL (to avoid NULL check at runtime)
- use buffer_truncate() instead of buffer_string_set_length() to
truncate string, and use buffer_extend() to extend
Examples where buffer known not to be NULL:
- cpv->v.b from config_plugin_values_init is not NULL if T_CONFIG_BOOL
(though we might set it to NULL if buffer_is_blank(cpv->v.b))
- address of buffer is arg (&foo)
(compiler optimizer detects this in most, but not all, cases)
- buffer is checked for NULL earlier in func
- buffer is accessed in same scope without a NULL check (e.g. b->ptr)
internal behavior change:
callers must not pass a NULL buffer to some funcs.
- buffer_init_buffer() requires non-null args
- buffer_copy_buffer() requires non-null args
- buffer_append_string_buffer() requires non-null args
- buffer_string_space() requires non-null arg
enabled by default
disable using server.feature-flags += ("server.pcre_jit" => "disable")
Available since pcre-8.20 (2011), and improved in pcre-8.32 (2012),
PCRE_STUDY_JIT_COMPILE can greatly speed up repeated execution of PCRE
patterns. (https://zherczeg.github.io/sljit/pcre.html)
lighttpd continues to use pcre_exec() instead of pcre_jit_exec(),
even though doing so does not realize all of the performance increase
potentially available with PCRE_STUDY_JIT_COMPILE and pcre_jit_exec().
pcre_jit_exec() is available with PCRE 8.32 and later, if PCRE is
compiled with --enable-jit, but lighttpd does not currently use
pcre_jit_exec() since the PCRE library might not have been compiled
with --enable-jit (though this could be solved with a weak symbol).
Similarly, lighttpd does not currently configure the pcre_jit_stack.
(Using pcre_jit_exec() may be revisited in the future.)
x-ref:
"add support for pcre JIT"
https://redmine.lighttpd.net/issues/2361
tighten struct data_config and config_cond_info
create config key at startup and reuse for debug/trace
separate routine for configparser_parse_condition()
separate routine for configparser_parse_else_condition()
NB: r->tmp_buf == srv->tmp_buf (pointer is copied for quicker access)
NB: request read and write chunkqueues currently point to connection
chunkqueues; per-request and per-connection chunkqueues are
not distinct from one another
con->read_queue == r->read_queue
con->write_queue == r->write_queue
NB: in the future, a separate connection config may be needed for
connection-level module hooks. Similarly, might need to have
per-request chunkqueues separate from per-connection chunkqueues.
Should probably also have a request_reset() which is distinct from
connection_reset().
convert all log_error_write() to log_error() and pass (log_error_st *)
use con->errh in preference to srv->errh (even though currently same)
avoid passing (server *) when previously used only for logging (errh)
array_get_element_klen() is now intended for read-only access
array_get_data_unset() is used by config processing for r/w access
array_get_buf_ptr() is used for r/w access to ds->value (string buffer)
Provide means to encode redirect and rewrite backreference substitutions
In addition to $1 and %1, the following modifiers are now supported,
followed by the number for the backreference, e.g. ${esc:1}
${noesc:...} no escaping
${esc:...} escape all non-alphanumeric - . _ ~ incl double-escape %
${escape:...} escape all non-alphanumeric - . _ ~ incl double-escape %
${escnde:...} escape all non-alphanumeric - . _ ~ but no double-esc %
${tolower:...}
${toupper:...}
%{noesc:...}
%{esc:...}
%{escape:...}
%{escnde:...}
%{tolower:...}
%{toupper:...}
Provide means to substitute URI parts without needing a regex match
(and can be preceded by encoding modifier,
e.g. ${tolower:url.authority})
${url.scheme}
${url.authority}
${url.port}
${url.path}
${url.query}
${qsa} appends query string, if not empty
x-ref:
"[PATCH] mod_redirect: Add support for url-encoding backreferences, map %%n->%n, $$n->$n"
https://redmine.lighttpd.net/issues/443
"Need for URL encoding in mod_redirect and possibly mod_rewrite"
https://redmine.lighttpd.net/issues/911
server.http-parseopts = ( ... ) URL normalization options
Note: *not applied* to CONNECT method
Note: In a future release, URL normalization likely enabled by default
(normalize URL, reject control chars, remove . and .. path segments)
To prepare for this change, lighttpd.conf configurations should
explicitly select desired behavior by enabling or disabling:
server.http-parseopts = ( "url-normalize" => "enable", ... )
server.http-parseopts = ( "url-normalize" => "disable" )
x-ref:
"lighttpd ... compares URIs to patterns in the (1) url.redirect and (2) url.rewrite configuration settings before performing URL decoding, which might allow remote attackers to bypass intended access restrictions, and obtain sensitive information or possibly modify data."
https://www.cvedetails.com/cve/CVE-2008-4359/
"Rewrite/redirect rules and URL encoding"
https://redmine.lighttpd.net/issues/1720
provide standard types in first.h instead of base.h
provide lighttpd types in base_decls.h instead of settings.h
reduce headers exposed by headers for core data structures
do not expose <pcre.h> or <stdlib.h> in headers
move stat_cache_entry to stat_cache.h
reduce use of "server.h" and "base.h" in headers
More specific checks on contents of array lists. Each module using
lists now does better checking on the types of values in the list
(strings, integers, arrays/lists)
This helps prevent misconfiguration of things like cgi.assign,
fastcgi.server, and scgi.server, where source code might be
served as static files if parenthesis are misplaced.
x-ref:
https://redmine.lighttpd.net/boards/2/topics/6571