use stack w/ pcre_exec unless saving captures from config conditions
reduce memory allocations per request where lighttpd.conf does not
contain url.redirect or url.rewrite rules where replacements reference
a match of the enclosing lighttpd.conf condition (e.g. %0, %1, %2 ...)
move cond_cache_t 'patterncount' to cond_match_t 'captures'
While cond_match_t is no longer sized to a power of 2, it is expected
to be used far less often than before (when it was always used),
since it is now used only with url.redirect or url.rewrite replacements
that reference %0, %1, %2, ...
rename data_config_pcre_exec() to config_pcre_match()
and move logic saving pcre result state from
config_check_cond_nocache()
split config_check_cond_nocache() into two funcs
config_check_cond_nocache() participates in recursion
config_check_cond_nocache_eval() evaluates the condition
save config regex captures separately only if used by url.redirect
or url.rewrite replacement directives within the condition
(or for conditions containing directives from any other module
which calls config_capture() for its directives during init)
keep pointer to match data (cond_match_t *) in r->cond_match[]
rather than cond_match_t to reduce data copying in h2_init_stream().
h2_init_stream() copies the results for already-evaluated conditions
to avoid re-evaluating connection-level conditions for each and every
stream. When conditions are reset, the pointer in r->cond_match[]
is updated when the condition is re-evaluated. (This all assumes that
HTTP/2 connection-level conditions are not unset or re-evaluated once
HTTP/2 streams begin.)
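
A minimal sketch of the pointer-versus-struct trade-off described above.
All names (sketch_cond_match, sketch_request, sketch_init_stream) and the
array sizes are hypothetical and are not lighttpd's actual definitions.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_NCOND 4   /* assumed number of config conditions */

typedef struct {
    int captures;     /* number of captured substrings */
    int matches[30];  /* offset pairs from the regex match */
} sketch_cond_match;

typedef struct {
    /* pointers into storage owned elsewhere; NULL when not evaluated */
    sketch_cond_match *cond_match[SKETCH_NCOND];
} sketch_request;

/* A new stream inherits already-evaluated connection-level results by
 * copying the pointers, not the match data itself. */
void sketch_init_stream(sketch_request *stream, const sketch_request *conn) {
    assert(stream != NULL && conn != NULL);
    memcpy(stream->cond_match, conn->cond_match, sizeof(conn->cond_match));
}
```

With pointers, the per-stream copy is SKETCH_NCOND pointer-sized words
rather than SKETCH_NCOND full match structs.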
- http_method_buf() returns (const buffer *)
- comment out unused get_http_status_name()
- inline func for http_append_method()
config processing requires a persistent buffer for method on the
off-chance that the config performed a capturing regex match in
$HTTP["method"] condition and used it later (e.g. in mod_rewrite)
(Prior behavior using r->tmp_buf was undefined in this case)
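
One way to provide a persistent method buffer is a table with static
storage duration, sketched below. The type sketch_cbuffer, the enum, and
sketch_method_buf() are hypothetical stand-ins, not the real
http_method_buf() implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* hypothetical read-only buffer: 'used' counts the string plus '\0' */
typedef struct {
    const char *ptr;
    uint32_t used;
} sketch_cbuffer;

enum sketch_method { SKETCH_GET, SKETCH_POST, SKETCH_METHOD_COUNT };

/* Returns a pointer that stays valid for the life of the process, so a
 * regex capture referring to it cannot dangle the way a capture into a
 * shared scratch buffer could. */
const sketch_cbuffer *sketch_method_buf(enum sketch_method m) {
    static const sketch_cbuffer names[SKETCH_METHOD_COUNT] = {
        { "GET",  sizeof("GET")  },
        { "POST", sizeof("POST") }
    };
    return ((unsigned)m < (unsigned)SKETCH_METHOD_COUNT) ? &names[m] : NULL;
}
```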
parse $HTTP["remote-ip"] CIDR mask into structured data at startup
note: adds buffer_move() to configparser.y to reduce memory copying
for all config values, and is required for remote-ip to preserve the
structured data added after the config value string. (Alternatively,
could have normalized the remote-ip value after copying into dc->string)
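
A rough sketch of what parsing a CIDR value into structured data at
startup can look like. It handles IPv4 only, and cidr4_t, cidr4_parse()
and cidr4_match() are hypothetical names for illustration, not the code
added by this commit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <arpa/inet.h>

/* hypothetical structured form of an IPv4 "addr/bits" config value */
typedef struct {
    uint32_t addr;  /* network address, host byte order, already masked */
    uint32_t mask;  /* netmask, host byte order */
} cidr4_t;

/* Parse once at startup; returns 0 on success, -1 on malformed input. */
int cidr4_parse(const char *s, cidr4_t *out) {
    assert(s != NULL && out != NULL);
    char ip[INET_ADDRSTRLEN];
    const char *slash = strchr(s, '/');
    size_t iplen = slash ? (size_t)(slash - s) : strlen(s);
    if (iplen == 0 || iplen >= sizeof(ip)) return -1;
    memcpy(ip, s, iplen);
    ip[iplen] = '\0';

    long bits = 32;  /* no "/bits" suffix means a single host */
    if (slash) {
        char *end;
        bits = strtol(slash + 1, &end, 10);
        if (end == slash + 1 || *end != '\0' || bits < 0 || bits > 32)
            return -1;
    }

    struct in_addr a;
    if (inet_pton(AF_INET, ip, &a) != 1) return -1;

    out->mask = bits ? (uint32_t)(0xFFFFFFFFu << (32 - bits)) : 0;
    out->addr = ntohl(a.s_addr) & out->mask;
    return 0;
}

/* The per-request check is then two integer operations. */
int cidr4_match(const cidr4_t *c, uint32_t remote_host_order) {
    return (remote_host_order & c->mask) == c->addr;
}
```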
This commit is a large set of code changes and results in removal of
hundreds, perhaps thousands, of CPU instructions, a portion of which
are on hot code paths.
Most (buffer *) used by lighttpd are not NULL, especially since buffers
were inlined into numerous larger structs such as request_st and chunk.
In the small number of instances where that is not the case, a NULL
check is often performed earlier in a function where that buffer is
later used with a buffer_* func. In the handful of cases that remained,
a NULL check was added, e.g. with r->http_host and r->conf.server_tag.
- check for empty strings at config time and set value to NULL if a blank
string will be ignored at runtime; at runtime, a simple pointer check
for NULL then suffices to test for a value that has been set and is not
blank ("")
- use buffer_is_blank() instead of buffer_string_is_empty(),
and use buffer_is_unset() instead of buffer_is_empty(),
where buffer is known not to be NULL so that NULL check can be skipped
- use buffer_clen() instead of buffer_string_length() when buffer is
known not to be NULL (to avoid NULL check at runtime)
- use buffer_truncate() instead of buffer_string_set_length() to
truncate string, and use buffer_extend() to extend
Examples where buffer known not to be NULL:
- cpv->v.b from config_plugin_values_init is not NULL if T_CONFIG_STRING
(though we might set it to NULL if buffer_is_blank(cpv->v.b))
- address of buffer is arg (&foo)
(compiler optimizer detects this in most, but not all, cases)
- buffer is checked for NULL earlier in func
- buffer is accessed in same scope without a NULL check (e.g. b->ptr)
internal behavior change:
callers must not pass a NULL buffer to some funcs.
- buffer_init_buffer() requires non-null args
- buffer_copy_buffer() requires non-null args
- buffer_append_string_buffer() requires non-null args
- buffer_string_space() requires non-null arg
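
The difference between the NULL-tolerant and the non-NULL helpers can be
sketched as follows. The struct is a simplified stand-in for lighttpd's
buffer, and the sketch_* functions are illustrative assumptions about the
semantics described above, not the library's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* simplified stand-in: 'used' counts the string bytes plus the
 * terminating '\0', and is 0 when nothing has been set */
typedef struct {
    char *ptr;
    uint32_t used;
    uint32_t size;
} buffer;

/* NULL-tolerant accessor: pays for a pointer check on every call */
uint32_t sketch_string_length(const buffer *b) {
    return (b != NULL && b->used != 0) ? b->used - 1 : 0;
}

/* caller guarantees b != NULL, so the pointer check is dropped */
uint32_t sketch_clen(const buffer *b) {
    return b->used - (b->used != 0);
}

/* unset: no string at all; blank: unset or the empty string "" */
int sketch_is_unset(const buffer *b) { return b->used == 0; }
int sketch_is_blank(const buffer *b) { return b->used < 2; }

/* config-time normalization: a blank value becomes NULL so that the
 * runtime test is a plain pointer comparison */
const buffer *sketch_normalize(const buffer *b) {
    return (b != NULL && !sketch_is_blank(b)) ? b : NULL;
}
```

After normalization at config time, runtime code can test the returned
pointer directly instead of calling a length or emptiness helper.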
tighten struct data_config and config_cond_info
create config key at startup and reuse for debug/trace
separate routine for configparser_parse_condition()
separate routine for configparser_parse_else_condition()
(optional addition to (data_string *), used by http_header.[ch])
extend (data_string *) instead of creating another data_* TYPE_*
(new data type would probably have (data_string *) as base class)
(might revisit choice in the future)
HTTP_HEADER_UNSPECIFIED has been removed. It was used in select
locations as an optimization to avoid looking up enum http_header_e
before checking the array, but the ordering in the array now relies
on having the id. Having the id allows for a quick check if a common
header is present or not in the htags bitmask, before checking the
array, and allows for integer comparison in the O(log n) search of the
array, instead of strncasecmp().
With HTTP_HEADER_UNSPECIFIED removed, add optimization to set bit
in htags for HTTP_HEADER_OTHER when an "other" header is added,
but do not clear the bit, as there might be additional "other" headers.
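
The htags quick check described above can be sketched roughly as below.
The enum values, the struct, and the sketch_* function names are
hypothetical; only the idea of one bit per header id comes from the text.

```c
#include <stdint.h>

/* hypothetical subset of header ids */
enum sketch_header_e {
    SKETCH_HEADER_OTHER = 0,
    SKETCH_HEADER_CONTENT_LENGTH,
    SKETCH_HEADER_HOST
};

typedef struct {
    uint64_t htags;  /* bit N set => a header with id N may be present */
} sketch_headers;

void sketch_header_mark(sketch_headers *h, enum sketch_header_e id) {
    h->htags |= (uint64_t)1 << id;
}

/* Cleared only for specific ids; the OTHER bit is left set because
 * additional "other" headers may still be in the array. */
void sketch_header_unmark(sketch_headers *h, enum sketch_header_e id) {
    if (id != SKETCH_HEADER_OTHER)
        h->htags &= ~((uint64_t)1 << id);
}

/* quick rejection before searching the header array */
int sketch_header_maybe_present(const sketch_headers *h,
                                enum sketch_header_e id) {
    return (h->htags & ((uint64_t)1 << id)) != 0;
}
```

A clear bit means the array search can be skipped entirely; a set bit for
"other" only means the array must still be searched.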
group HANDLER_COMEBACK logic in http_response_comeback() and call it
from places that reset state in order to (sometimes partially) reprocess
a request. This includes the error handler (server.error-handler),
r->handler_module when cgi.local-redir, and looping in
http_response_prepare() when modules make changes to the request and
return HANDLER_COMEBACK (e.g. mod_rewrite, mod_magnet, mod_cml)
Also, set r->conditional_is_valid closer to where elements are set
(and become valid for use in condition checks), and parse target
in http_request_parse() instead of http_response_prepare()
NB: r->tmp_buf == srv->tmp_buf (pointer is copied for quicker access)
NB: request read and write chunkqueues currently point to connection
chunkqueues; per-request and per-connection chunkqueues are
not distinct from one another
con->read_queue == r->read_queue
con->write_queue == r->write_queue
NB: in the future, a separate connection config may be needed for
connection-level module hooks. Similarly, might need to have
per-request chunkqueues separate from per-connection chunkqueues.
Should probably also have a request_reset() which is distinct from
connection_reset().
convert all log_error_write() to log_error() and pass (log_error_st *)
use con->errh in preference to srv->errh (even though currently same)
avoid passing (server *) when previously used only for logging (errh)
new data structures and interface for processing config directives
(towards more efficient approach to config merging)
continue work to isolate data_config
array_get_element_klen() is now intended for read-only access
array_get_data_unset() is used by config processing for r/w access
array_get_buf_ptr() is used for r/w access to ds->value (string buffer)
even 2 billion is way larger than even extreme operating values
expected for the members in base.h
include some structs directly in struct server, rather than by ptr
quickly clear buffer instead of buffer_string_set_length(b, 0) or
buffer_reset(b). Avoids free() of large buffers about to be reused,
or buffers that are module-scoped, persistent, and reused.
(buffer_reset() should still be used with buffers in connection *con
when the data in the buffers is supplied by external, untrusted source)
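
A simplified sketch of the distinction between clearing and resetting a
buffer. The struct is a stand-in for lighttpd's buffer, the sketch_*
functions are illustrative, and the 4096-byte threshold is an assumption
chosen for the example.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    char *ptr;
    uint32_t used;
    uint32_t size;
} buffer;

/* clear: forget the contents, keep the allocation for reuse */
void sketch_clear(buffer *b) {
    b->used = 0;
}

/* reset: also release oversized allocations, so that one large input
 * from an untrusted source does not pin memory indefinitely */
#define SKETCH_MAX_KEEP 4096u   /* assumed threshold */
void sketch_reset(buffer *b) {
    b->used = 0;
    if (b->size > SKETCH_MAX_KEEP) {
        free(b->ptr);
        b->ptr = NULL;
        b->size = 0;
    }
}
```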
save 40 bytes (64-bit), or 16 bytes (32-bit) per data_* element
at the cost of going through indirect function pointer to execute
methods. At runtime, the reset() method is most used among them.
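
The space-for-indirection trade-off can be sketched as a shared method
table referenced by one pointer per element. The sketch_* names and the
two-method table are hypothetical and do not reproduce lighttpd's data_*
definitions.

```c
#include <stddef.h>

struct sketch_data;

/* one shared, read-only method table per data type */
typedef struct {
    void (*reset)(struct sketch_data *d);
    void (*free_fn)(struct sketch_data *d);
} sketch_data_methods;

/* each element carries one pointer to the table instead of one
 * function pointer per method */
typedef struct sketch_data {
    const sketch_data_methods *fn;
    int value;
} sketch_data;

void sketch_int_reset(sketch_data *d) { d->value = 0; }
void sketch_int_free(sketch_data *d)  { (void)d; /* nothing owned here */ }

const sketch_data_methods sketch_int_methods = {
    sketch_int_reset,
    sketch_int_free
};
```

A call such as d->fn->reset(d) costs one extra pointer load compared with
a function pointer stored directly in the element.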