Commit Graph

126 Commits

Author SHA1 Message Date
Glenn Strauss 05dc3d123a [core] better asm for binary num to ascii string
compiler optimizers generally convert div to an equivalent mul,
though not always optimally for modulus (%).  In places where
lighttpd is using both quotient and remainder, calculate the
remainder from the quotient.

x-ref: inspiration:
  https://lemire.me/blog/2019/02/08/faster-remainders-when-the-divisor-is-a-constant-beating-compilers-and-libdivide/
  https://lemire.me/blog/2019/02/20/more-fun-with-fast-remainders-when-the-divisor-is-a-constant/
2021-11-12 17:56:38 -05:00
Glenn Strauss 7b615d5d24 [multiple] de-dup file and piped loggers (fixes #3101)
de-dup file and piped loggers for error logs and access logs

x-ref:
  "RFE: de-dup file and piped loggers"
  https://redmine.lighttpd.net/issues/3101
2021-09-13 03:37:11 -04:00
Glenn Strauss 243510dbb4 [core] fdlog.[ch]; fdevent_*_logger_* -> fdlog_*
code move and rename fdevent_*_logger_*() to fdlog_*()
2021-09-11 05:31:09 -04:00
Glenn Strauss 309c1693ac [multiple] Y2038 32-bit signed time_t mitigations
Most OS platforms have already provided solutions to
Y2038 32-bit signed time_t 5 - 10 years ago (or more!)
Notable exceptions are Linux i686 and FreeBSD i386.

Since 32-bit systems tend to be embedded systems,
and since many distros take years to pick up new software,
this commit aims to provide Y2038 mitigations for lighttpd
running on 32-bit systems with Y2038-unsafe 32-bit signed time_t

* Y2038: lighttpd 1.4.60 and later report Y2038 safety
  $ lighttpd -V
    + Y2038 support                                    # Y2038-SAFE
  $ lighttpd -V
    - Y2038 support (unsafe 32-bit signed time_t)      # Y2038-UNSAFE

* Y2038: general platform info
  * Y2038-SAFE: lighttpd 64-bit builds on platforms using 64-bit time_t
      - all major 64-bit platforms (known to this author) use 64-bit time_t
  * Y2038-SAFE: lighttpd 32-bit builds on platforms using 64-bit time_t
      - Linux x32 ABI (different from i686)
      - FreeBSD all 32-bit and 64-bit architectures *except* 32-bit i386
      - NetBSD 6.0 (released Oct 2012) all 32-bit and 64-bit architectures
      - OpenBSD 5.5 (released May 2014) all 32-bit and 64-bit architectures
      - Microsoft Windows XP and Visual Studio 2005 (? unsure ?)
        Another reference suggests Visual Studio 2015 defaults to 64-bit time_t
      - MacOS 10.15 Catalina (released 2019) drops support for 32-bit apps
  * Y2038-SAFE: lighttpd 32-bit builds on platforms using 32-bit unsigned time_t
      - e.g. OpenVMS (unknown if lighttpd builds on this platform)
  * Y2038-UNSAFE: lighttpd 32-bit builds on platforms using 32-bit signed time_t
      - Linux 32-bit (including i686)
          - glibc 32-bit library support not yet available for 64-bit time_t
              - https://sourceware.org/glibc/wiki/Y2038ProofnessDesign
              - Linux kernel 5.6 on 32-bit platforms does support 64-bit time_t
                https://itsubuntu.com/linux-kernel-5-6-to-fix-the-year-2038-issue-unix-y2k/
              - https://www.gnu.org/software/libc/manual/html_node/64_002dbit-time-symbol-handling.html
                "Note: at this point, 64-bit time support in dual-time
                 configurations is work-in-progress, so for these
                 configurations, the public API only makes the 32-bit time
                 support available. In a later change, the public API will
                 allow user code to choose the time size for a given
                 compilation unit."
              - compiling with -D_TIME_BITS=64 currently has no effect
          - glibc recent (Jul 2021) mailing list discussion
              - https://public-inbox.org/bug-gnulib/878s2ozq70.fsf@oldenburg.str.redhat.com/T/
      - FreeBSD i386
      - DragonFlyBSD 32-bit

* Y2038 mitigations attempted on Y2038-UNSAFE platforms (32-bit signed time_t)
  * lighttpd prefers system monotonic clock instead of realtime clock
    in places where realtime clock is not required
  * lighttpd treats negative time_t values as after 19 Jan 2038 03:14:07 GMT
  * (lighttpd presumes that lighttpd will not encounter dates before 1970
    during normal operation.)
  * lighttpd casts struct stat st.st_mtime (and st.st_*time) through uint64_t
    to convert negative timestamps for comparisions with 64-bit timestamps
    (treating negative timestamp values as after 19 Jan 2038 03:14:07 GMT)
  * lighttpd provides unix_time64_t (int64_t) and
  * lighttpd provides struct unix_timespec64 (unix_timespec64_t)
    (struct timespec equivalent using unix_time64_t tv_sec member)
  * lighttpd provides gmtime64_r() and localtime64_r() wrappers
    for platforms 32-bit platforms using 32-bit time_t and
    lighttpd temporarily shifts the year in order to use
    gmtime_r() and localtime_r() (or gmtime() and localtime())
    from standard libraries, before readjusting year and passing
    struct tm to formatting functions such as strftime()
  * lighttpd provides TIME64_CAST() macro to cast signed 32-bit time_t to
    unsigned 32-bit and then to unix_time64_t

* Note: while lighttpd tries handle times past 19 Jan 2038 03:14:07 GMT
  on 32-bit platforms using 32-bit signed time_t, underlying libraries and
  underlying filesystems might not behave properly after 32-bit signed time_t
  overflows (19 Jan 2038 03:14:08 GMT).  If a given 32-bit OS does not work
  properly using negative time_t values, then lighttpd likely will not work
  properly on that system.

* Other references and blogs
  - https://en.wikipedia.org/wiki/Year_2038_problem
  - https://en.wikipedia.org/wiki/Time_formatting_and_storage_bugs
  - http://www.lieberbiber.de/2017/03/14/a-look-at-the-year-20362038-problems-and-time-proofness-in-various-systems/
2021-09-04 08:08:26 -04:00
Glenn Strauss f1e8a82f1a [multiple] inline struct in con->dst_addr_buf
(mod_extforward recently changed to use buffer_move() to save addr
 instead of swapping pointers)
2021-08-27 02:16:54 -04:00
Glenn Strauss af3df29ae8 [multiple] reduce redundant NULL buffer checks
This commit is a large set of code changes and results in removal of
hundreds, perhaps thousands, of CPU instructions, a portion of which
are on hot code paths.

Most (buffer *) used by lighttpd are not NULL, especially since buffers
were inlined into numerous larger structs such as request_st and chunk.

In the small number of instances where that is not the case, a NULL
check is often performed earlier in a function where that buffer is
later used with a buffer_* func.  In the handful of cases that remained,
a NULL check was added, e.g. with r->http_host and r->conf.server_tag.

- check for empty strings at config time and set value to NULL if blank
  string will be ignored at runtime; at runtime, simple pointer check
  for NULL can be used to check for a value that has been set and is not
  blank ("")
- use buffer_is_blank() instead of buffer_string_is_empty(),
  and use buffer_is_unset() instead of buffer_is_empty(),
  where buffer is known not to be NULL so that NULL check can be skipped
- use buffer_clen() instead of buffer_string_length() when buffer is
  known not to be NULL (to avoid NULL check at runtime)
- use buffer_truncate() instead of buffer_string_set_length() to
  truncate string, and use buffer_extend() to extend

Examples where buffer known not to be NULL:
  - cpv->v.b from config_plugin_values_init is not NULL if T_CONFIG_BOOL
    (though we might set it to NULL if buffer_is_blank(cpv->v.b))
  - address of buffer is arg (&foo)
    (compiler optimizer detects this in most, but not all, cases)
  - buffer is checked for NULL earlier in func
  - buffer is accessed in same scope without a NULL check (e.g. b->ptr)

internal behavior change:
  callers must not pass a NULL buffer to some funcs.
  - buffer_init_buffer() requires non-null args
  - buffer_copy_buffer() requires non-null args
  - buffer_append_string_buffer() requires non-null args
  - buffer_string_space() requires non-null arg
2021-08-27 02:16:53 -04:00
Glenn Strauss a6d1dccad3 [multiple] strftime %F and %T
strftime %F for %Y-%m-%d, and %T for %H:%M:%S
2021-04-02 01:16:08 -04:00
Glenn Strauss f711207d5c [mod_accesslog] reformat numeric timestamp code 2021-03-26 22:38:36 -04:00
Glenn Strauss 8308915b4e [mod_accesslog] strftime %z for numeric timestamp 2021-03-26 22:38:36 -04:00
Glenn Strauss 069c0fff21 [mod_accesslog] reformat numeric timestamp 2021-03-26 22:38:36 -04:00
Glenn Strauss dbe3e2361b [multiple] prefer monotonic time for internal use
Note: monotonic time does not change while VM is suspended

Continue to use real time where required by HTTP protocol, for logging
and for other user-visible instances, such as mod_status, as well as for
external databases and caches.
2021-03-11 18:59:53 -05:00
Glenn Strauss 5c2f5577b4 [core] save parsed listen addrs at startup
save parsed listen addrs at startup for reuse at runtime

srv_socket->srv_token is normalized at startup and contains IP and port.
save offset to colon, if present, or else length of string (unix socket)

At runtime, srv_token_colon can be quickly used as length of IP string
(without port) or, if not length of string, offset of stringified port
following the colon.
2021-03-07 04:38:34 -05:00
Glenn Strauss 4a600dabd5 [mod_auth] close HTTP/2 connection after bad pass
mitigation slows down brute force password attacks

x-ref:
  "Possible feature: authentication brute force hardening"
  https://redmine.lighttpd.net/boards/3/topics/8885
2021-02-06 08:29:41 -05:00
Glenn Strauss 122094e3e3 [multiple] employ http_date.h, sys-time.h
- replace use of strptime() w/ implementation specialized for HTTP dates
- use thread-safe gmtime_r(), localtime_r() (replace localtime, gmtime)
2020-12-24 16:13:20 -05:00
Glenn Strauss 949662d27e [multiple] add some missing config cleanup
(thx stbuehler)
2020-10-24 16:08:21 -04:00
Glenn Strauss 55fb46f695 [mod_accesslog] update defaults after cycling log
(thx avij)

must update the cached copy of global scope config after cycling log.
Although (accesslog_st *) is modified in-place, the log_access_fd member
of (accesslog_st *) is copied into the cache and must be updated after
cycling logs in the global scope.
2020-10-24 14:38:47 -04:00
Glenn Strauss 81029b8b51 [multiple] inline chunkqueue where always alloc'd
inline struct chunkqueue where always allocated in other structs

(memory locality)
2020-10-11 12:19:27 -04:00
Glenn Strauss 1a64c9e2f7 [core] reuse r->start_hp.tv_sec for r->start_ts
(remove duplicated field from (request_st *))
2020-10-11 12:19:27 -04:00
Glenn Strauss 2e0676fd6d [core] extend (data_string *) to store header id
(optional addition to (data_string *), used by http_header.[ch])

extend (data_string *) instead of creating another data_* TYPE_*
  (new data type would probably have (data_string *) as base class)
  (might revisit choice in the future)

HTTP_HEADER_UNSPECIFIED has been removed.  It was used in select
locations as an optimization to avoid looking up enum header_header_e
before checking the array, but the ordering in the array now relies
on having the id.  Having the id allows for a quick check if a common
header is present or not in the htags bitmask, before checking the
array, and allows for integer comparison in the log(n) search of the
array, instead of strncasecmp().

With HTTP_HEADER_UNSPECIFIED removed, add optimization to set bit
in htags for HTTP_HEADER_OTHER when an "other" header is added,
but do not clear the bit, as there might be addtl "other" headers
2020-10-11 12:19:26 -04:00
Glenn Strauss d4937e29f1 [mod_accesslog,mod_rrdtool] HTTP/2 basic accounting
Note: rrdtool counts do not include HTTP/2 protocol overhead.
Continue to count mod_rrdtool per request rather than per connection
so that data is updated after each request, rather than aggregated
to the end of a potentially long-lived connection with many keep-alives.
2020-10-03 09:05:38 -04:00
Glenn Strauss afc2025d8e [core] reset connection counters per connection
reset connection counters per connection, not per request

adjust mod_accesslog and mod_rrdtool usage

continue to count mod_rrdtool per request rather than per connection
so that data is updated after each request, rather than aggregated
to the end of a potentially long-lived connection with many keep-alives.
2020-10-03 09:05:38 -04:00
Glenn Strauss a8f8d5edc0 [core] HTTP_VERSION_2 2020-08-13 15:05:25 -04:00
Glenn Strauss f7bac374ee [mod_accesslog] process backslash-escapes in fmt
Process basic backslash-escapes in format string from lighttpd.conf
Supported sequences: \a \b \f \n \r \t \v
Other backslash-sequences are replaces with the char following backslash

(Apache mod_log_config supports \n and \t as special-cases)
2020-07-08 19:54:30 -04:00
Glenn Strauss a0029b21a1 [core] remove r->uri.path_raw; generate as needed
(r->uri.path_raw previously duplicated from r->target, minus query-part)
2020-07-08 19:54:29 -04:00
Glenn Strauss 7c7f8c467c [multiple] split con, request (very large change)
NB: r->tmp_buf == srv->tmp_buf (pointer is copied for quicker access)

NB: request read and write chunkqueues currently point to connection
    chunkqueues; per-request and per-connection chunkqueues are
    not distinct from one another
      con->read_queue  == r->read_queue
      con->write_queue == r->write_queue

NB: in the future, a separate connection config may be needed for
    connection-level module hooks.  Similarly, might need to have
    per-request chunkqueues separate from per-connection chunkqueues.
    Should probably also have a request_reset() which is distinct from
    connection_reset().
2020-07-08 19:54:29 -04:00
Glenn Strauss aca9d45adf [core] move request state into (request_st *)
NB: in the future, a separate connection state may be needed for
    connection-level state (different from request state)
2020-07-08 19:54:29 -04:00
Glenn Strauss 8131e4396d [core] move addtl request-specific struct members 2020-07-08 19:54:29 -04:00
Glenn Strauss 1474be7859 [core] move addtl request-specific struct members 2020-07-08 19:54:29 -04:00
Glenn Strauss 6fe031ef37 [core] move request start ts into (request_st *)
move request start timestamps into (request_st *)
2020-07-08 19:54:29 -04:00
Glenn Strauss b157ee8dfa [mod_accesslog] log_access_record() fmt log record
separate func to append log record to buffer
2020-07-08 19:54:29 -04:00
Glenn Strauss 057d83c50b [core] move keep_alive flag into request_st 2020-07-08 19:54:29 -04:00
Glenn Strauss 4069dc2ad7 [mod_accesslog] flush file log buffer at 8k size 2020-07-08 19:54:29 -04:00
Glenn Strauss cbdbd60b35 [multiple] quiet clang compiler warnings 2020-07-08 19:54:28 -04:00
Glenn Strauss b5775b9951 [multiple] reduce direct use of srv->errh 2020-07-08 19:54:28 -04:00
Glenn Strauss c8cd7cf49b [multiple] extern log_epoch_secs
replace srv->cur_ts
2020-07-08 19:54:28 -04:00
Glenn Strauss 409bba80b1 [multiple] reduce direct use of srv->cur_ts 2020-07-08 19:54:28 -04:00
Glenn Strauss 50bdb55de8 [multiple] connection hooks no longer get (srv *)
(explicit (server *) not passed; available in con->srv)
2020-07-08 19:54:28 -04:00
Glenn Strauss 010c28949c [multiple] prefer (connection *) to (srv *)
convert all log_error_write() to log_error() and pass (log_error_st *)

use con->errh in preference to srv->errh (even though currently same)

avoid passing (server *) when previously used only for logging (errh)
2020-07-08 19:54:28 -04:00
Glenn Strauss b73949e03f [multiple] plugin.c handles common FREE_FUNC code
(simpler for modules; less boilerplate to cut-n-paste)
2020-07-08 18:08:51 -04:00
Glenn Strauss 74bbb3077f [mod_accesslog] use config_plugin_values_init()
inline various structures and use C99 VLA to increase performance
(reduce pointer chasing)
2020-07-08 18:08:51 -04:00
Glenn Strauss e2de4e581e [core] const char *name in struct plugin
put void *data (always used) as first member of struct plugin

add int nconfig member to PLUGIN_DATA

calloc() inits p->data to NULL
2020-05-23 17:59:29 -04:00
Glenn Strauss 36f64b26a1 [core] simpler config_check_cond()
optimize for common case where condition has been evaluated for
the request and a cached result exists

(also: begin isolating data_config)
2020-05-23 17:59:29 -04:00
Glenn Strauss b6e0880ae6 [mod_accesslog] avoid alloc for parsing cookie val 2020-05-23 17:59:29 -04:00
Glenn Strauss 6eb34ef5ab [core] add const to callers of http_header_*_get()
(The few places where value is modified in-place were not made const)
2020-02-24 11:15:32 -05:00
Glenn Strauss 47a758f959 [core] inline buffer key for *_patch_connection()
handle buffer key as part of DATA_UNSET in *_patch_connection()
(instead of key being (buffer *))
2020-02-24 11:15:32 -05:00
Glenn Strauss 330c39c694 [mod_accesslog] parse multiple cookies (fixes #2986)
(thx xoneca)

x-ref:
  "Cookie format specifier is broken"
  https://redmine.lighttpd.net/issues/2986
2020-01-26 00:41:05 -05:00
Mohammed Sadiq 6a988bb0d0 [multiple] cleaner calloc use in SETDEFAULTS_FUNC
github: closes #99

x-ref:
  "cleaner calloc use in SETDEFAULTS_FUNC"
  https://github.com/lighttpd/lighttpd1.4/pull/99
2019-04-20 02:09:04 -04:00
Glenn Strauss d28bac32fe [multiple] reduce code dup in list resizing
reduce code duplication in list resizing
realloc() of NULL ptr has behavior similar to malloc()

Note that if initial size == 0, then code used to adjust size
must be += x to ensure the size is non-zero for reallocation.
(Multiplying 0 * x, e.g. power-2 resizing, will result in 0.)
2019-02-12 22:36:04 -05:00
Glenn Strauss daa5f7c576 [mod_accesslog] attempt to reconstruct req line
cease http_request_parse_reqline() unconditionally copying request line,
as request line is currently used only by mod_accesslog 'r' format
2019-02-10 03:10:11 -05:00
Glenn Strauss f69bd9cdb8 [core] perf: simple, quick buffer_clear()
quickly clear buffer instead of buffer_string_set_length(b, 0) or
buffer_reset(b).  Avoids free() of large buffers about to be reused,
or buffers that are module-scoped, persistent, and reused.

(buffer_reset() should still be used with buffers in connection *con
 when the data in the buffers is supplied by external, untrusted source)
2018-11-23 00:37:38 -05:00