Commit Graph

205 Commits

Author SHA1 Message Date
Glenn Strauss 1acf9db7d3 [mod_ajp13,mod_fastcgi] check resp w/ content len
limit response body from mod_ajp13 and mod_fastcgi to Content-Length,
if Content-Length is provided in response headers; discard excess
2021-10-27 04:16:38 -04:00
Glenn Strauss 0757d71e14 [core] short-circuit if response body recv w/ hdrs (fixes #3111)
short-circuit if response body completely received with response headers

  "HTTP/2 requests sometimes take very long (missing last chunk)"
2021-10-27 04:16:38 -04:00
Glenn Strauss e4cf6998a1 [core] limit initial response header backend read 2021-10-01 06:39:47 -04:00
Glenn Strauss cb7deb493c [core] remove obsolete comment about r->gw_dechunk
r->gw_dechunk->b is not a candidate for using generic chunk buffers.
chunked headers are generally smaller and fit in default 64 byte alloc.
Also, lighttpd limits chunked header to 1k.

Avoid unneeded optimization since HTTP/1.1 use is likely to diminish
over time in favor of HTTP/2 or HTTP/3 or later.
2021-09-30 17:34:03 -04:00
Glenn Strauss 6e62b84258 [core] splice() data from backends to tempfiles
splice() data from backends to tempfiles (where splice() is available);
reduce copying data to userspace when writing data to tempfiles

Note: splice() on Linux returns EINVAL if target file has O_APPEND set
so lighttpd uses pwrite() (where available) when writing to tempfiles
(instead of lseek() + write(), or O_APPEND and write())
2021-09-30 17:34:03 -04:00
Glenn Strauss f19f71625c [multiple] internal control for backend read bytes
separate internal control for backend max_per_read

When not streaming, large reads will be flushed to temp files on disk.
When streaming, use a smaller buffer to help reduce memory usage.

When not streaming, attempt to read and empty kernel socket bufs.
  (e.g. MAX_READ_LIMIT 256k)

When writing to sockets (or pipes) attempt to fill kernel socket bufs.
  (e.g. MAX_WRITE_LIMIT 256k)
2021-09-28 11:05:55 -04:00
Glenn Strauss d59d5e59b9 [core] improve chunk buffer reuse from backends
improve oversized chunk buffer reuse from backends
2021-09-24 10:44:50 -04:00
Glenn Strauss 0b56c16a8b [core] reduce oversized mem alloc for backends
reduce oversized memory allocations when reading from backends:
avoid extra power-2 allocation for 1 byte ('\0') when data
available to read is exactly power-2
2021-09-16 04:40:12 -04:00
Glenn Strauss f99cb7d7ab [core] quiet coverity warnings 2021-09-09 04:08:26 -04:00
Glenn Strauss a3e9faa479 [multiple] quiet coverity warnings 2021-09-09 02:16:21 -04:00
Glenn Strauss da562e3fd6 [core] http_response_read() indicate resp finished
return HANDLER_FINISHED from http_response_read() if response finished,
whether due to reading EOF (prior behavior), or if Content-Length was
provided and we have finished reading Content-Length, or if a module
sets r->resp_body_finished for any other reason.  This may save an
unnecessary poll() and read() to receive EOF when Content-Length has
already been read.
2021-09-08 15:06:06 -04:00
Glenn Strauss 39a577985f [core] improve handling of suboptimal backend wr
more efficiently handle reading of suboptimal backend write behavior

check to accumulate small reads in mem before flushing to temp file
2021-09-08 15:06:06 -04:00
Glenn Strauss 833d658729 [core] http_response_append_{buffer,mem}()
manage r->resp_body_scratchpad in new funcs
rather than
which now only decode chunked encoding, more apropos for the func names
2021-09-08 15:06:06 -04:00
Glenn Strauss 2ef31a1b3f [core] chunkqueue_append_buffer always clears buf
chunkqueue_append_buffer() always clears buffer
(instead of relying on caller to do so after the call)
2021-09-08 15:06:06 -04:00
Glenn Strauss 4f96dac841 [core] gw_backend_error() shared code 2021-09-04 08:08:27 -04:00
Glenn Strauss dbf7588147 [core] tune http_response_process_headers()
- rearrange some code for better CPU cache use
- use http_header_str_contains_token()
2021-09-04 08:08:26 -04:00
Glenn Strauss 309c1693ac [multiple] Y2038 32-bit signed time_t mitigations
Most OS platforms have already provided solutions to
Y2038 32-bit signed time_t 5 - 10 years ago (or more!)
Notable exceptions are Linux i686 and FreeBSD i386.

Since 32-bit systems tend to be embedded systems,
and since many distros take years to pick up new software,
this commit aims to provide Y2038 mitigations for lighttpd
running on 32-bit systems with Y2038-unsafe 32-bit signed time_t

* Y2038: lighttpd 1.4.60 and later report Y2038 safety
  $ lighttpd -V
    + Y2038 support                                    # Y2038-SAFE
  $ lighttpd -V
    - Y2038 support (unsafe 32-bit signed time_t)      # Y2038-UNSAFE

* Y2038: general platform info
  * Y2038-SAFE: lighttpd 64-bit builds on platforms using 64-bit time_t
      - all major 64-bit platforms (known to this author) use 64-bit time_t
  * Y2038-SAFE: lighttpd 32-bit builds on platforms using 64-bit time_t
      - Linux x32 ABI (different from i686)
      - FreeBSD all 32-bit and 64-bit architectures *except* 32-bit i386
      - NetBSD 6.0 (released Oct 2012) all 32-bit and 64-bit architectures
      - OpenBSD 5.5 (released May 2014) all 32-bit and 64-bit architectures
      - Microsoft Windows XP and Visual Studio 2005 (? unsure ?)
        Another reference suggests Visual Studio 2015 defaults to 64-bit time_t
      - MacOS 10.15 Catalina (released 2019) drops support for 32-bit apps
  * Y2038-SAFE: lighttpd 32-bit builds on platforms using 32-bit unsigned time_t
      - e.g. OpenVMS (unknown if lighttpd builds on this platform)
  * Y2038-UNSAFE: lighttpd 32-bit builds on platforms using 32-bit signed time_t
      - Linux 32-bit (including i686)
          - glibc 32-bit library support not yet available for 64-bit time_t
              - Linux kernel 5.6 on 32-bit platforms does support 64-bit time_t
                "Note: at this point, 64-bit time support in dual-time
                 configurations is work-in-progress, so for these
                 configurations, the public API only makes the 32-bit time
                 support available. In a later change, the public API will
                 allow user code to choose the time size for a given
                 compilation unit."
              - compiling with -D_TIME_BITS=64 currently has no effect
          - glibc recent (Jul 2021) mailing list discussion
      - FreeBSD i386
      - DragonFlyBSD 32-bit

* Y2038 mitigations attempted on Y2038-UNSAFE platforms (32-bit signed time_t)
  * lighttpd prefers system monotonic clock instead of realtime clock
    in places where realtime clock is not required
  * lighttpd treats negative time_t values as after 19 Jan 2038 03:14:07 GMT
  * (lighttpd presumes that lighttpd will not encounter dates before 1970
    during normal operation.)
  * lighttpd casts struct stat st.st_mtime (and st.st_*time) through uint64_t
    to convert negative timestamps for comparisions with 64-bit timestamps
    (treating negative timestamp values as after 19 Jan 2038 03:14:07 GMT)
  * lighttpd provides unix_time64_t (int64_t) and
  * lighttpd provides struct unix_timespec64 (unix_timespec64_t)
    (struct timespec equivalent using unix_time64_t tv_sec member)
  * lighttpd provides gmtime64_r() and localtime64_r() wrappers
    for platforms 32-bit platforms using 32-bit time_t and
    lighttpd temporarily shifts the year in order to use
    gmtime_r() and localtime_r() (or gmtime() and localtime())
    from standard libraries, before readjusting year and passing
    struct tm to formatting functions such as strftime()
  * lighttpd provides TIME64_CAST() macro to cast signed 32-bit time_t to
    unsigned 32-bit and then to unix_time64_t

* Note: while lighttpd tries handle times past 19 Jan 2038 03:14:07 GMT
  on 32-bit platforms using 32-bit signed time_t, underlying libraries and
  underlying filesystems might not behave properly after 32-bit signed time_t
  overflows (19 Jan 2038 03:14:08 GMT).  If a given 32-bit OS does not work
  properly using negative time_t values, then lighttpd likely will not work
  properly on that system.

* Other references and blogs
2021-09-04 08:08:26 -04:00
Glenn Strauss af3df29ae8 [multiple] reduce redundant NULL buffer checks
This commit is a large set of code changes and results in removal of
hundreds, perhaps thousands, of CPU instructions, a portion of which
are on hot code paths.

Most (buffer *) used by lighttpd are not NULL, especially since buffers
were inlined into numerous larger structs such as request_st and chunk.

In the small number of instances where that is not the case, a NULL
check is often performed earlier in a function where that buffer is
later used with a buffer_* func.  In the handful of cases that remained,
a NULL check was added, e.g. with r->http_host and r->conf.server_tag.

- check for empty strings at config time and set value to NULL if blank
  string will be ignored at runtime; at runtime, simple pointer check
  for NULL can be used to check for a value that has been set and is not
  blank ("")
- use buffer_is_blank() instead of buffer_string_is_empty(),
  and use buffer_is_unset() instead of buffer_is_empty(),
  where buffer is known not to be NULL so that NULL check can be skipped
- use buffer_clen() instead of buffer_string_length() when buffer is
  known not to be NULL (to avoid NULL check at runtime)
- use buffer_truncate() instead of buffer_string_set_length() to
  truncate string, and use buffer_extend() to extend

Examples where buffer known not to be NULL:
  - cpv->v.b from config_plugin_values_init is not NULL if T_CONFIG_BOOL
    (though we might set it to NULL if buffer_is_blank(cpv->v.b))
  - address of buffer is arg (&foo)
    (compiler optimizer detects this in most, but not all, cases)
  - buffer is checked for NULL earlier in func
  - buffer is accessed in same scope without a NULL check (e.g. b->ptr)

internal behavior change:
  callers must not pass a NULL buffer to some funcs.
  - buffer_init_buffer() requires non-null args
  - buffer_copy_buffer() requires non-null args
  - buffer_append_string_buffer() requires non-null args
  - buffer_string_space() requires non-null arg
2021-08-27 02:16:53 -04:00
Glenn Strauss c8820d2ecc [core] code reuse with array_match_value_prefix()
use array_match_value_prefix() when checking xdocroot
2021-08-27 02:16:53 -04:00
Glenn Strauss 0532d67639 [core] document error edge case for HTTP/1.0
When lighttpd is not configured to stream the response body,
lighttpd sends partial content with an incorrect Content-Length
to an HTTP/1.0 client if a backend sends Transfer-Encoding: chunked
in response to lighttpd HTTP/1.1 request (to backend), and the response
from the backend ends up truncated.

lighttpd could instead send an HTTP/1.0 502 Bad Gateway, but the
current implementation chooses to send the partial content.  After all,
an HTTP/1.0 client is, well, HTTP/1.0, and so of limited intelligence.
2021-05-14 20:45:59 -04:00
Glenn Strauss 980554bc70 [core] simplify buffer_path_simplify() 2021-05-08 14:34:05 -04:00
Glenn Strauss b288eeafaa [core] http_response_send_file() mark cold paths 2021-05-06 17:35:00 -04:00
Glenn Strauss 6c40f997b9 [core] merge http_response_send_file 0-size case
merge http_response_send_file 0-sized file special case
(historically was a short-circuit before Range handling,
 but Range handling has been rewritten and moved elsewhere)
2021-05-06 17:35:00 -04:00
Glenn Strauss 9a5e1652be [multiple] static file optimization; reuse cache
reuse cache lookup in common case of serving a static file
rather than repeating the stat_cache_entry lookup
(which is more work than memcmp() to re-check stat_cache_entry match)
2021-05-06 17:35:00 -04:00
Glenn Strauss a1eba3c89b [core] reuse code to parse backend response
reuse code to parse backend response (http_header_parse_hoff())
2021-04-27 15:13:40 -04:00
Glenn Strauss 325d89b99f [multiple] more reuse of http_date_time_to_str() 2021-04-05 13:24:51 -04:00
Glenn Strauss dc01487ea6 [multiple] use buffer_append_* aggregates
reduces the number of round-trips into some frequently-called routines
2021-04-02 01:16:40 -04:00
Glenn Strauss a6d1dccad3 [multiple] strftime %F and %T
strftime %F for %Y-%m-%d, and %T for %H:%M:%S
2021-04-02 01:16:08 -04:00
Glenn Strauss 26f354cb37 [multiple] http_header APIs to reduce str copies 2021-03-26 22:38:36 -04:00
Glenn Strauss f5a62a0fd2 [core] http_response_handle_cachable() optim
short-circuit http_response_handle_cachable() if conditional request
headers are not present
2021-03-13 06:21:06 -05:00
Glenn Strauss c95f832f99 [core] http_cgi.[ch] CGI interfaces (RFC 3875)
collect Common Gateway Interface (CGI) interfaces (RFC 3875)
2021-03-07 04:38:34 -05:00
Glenn Strauss 1f96e59d03 [core] http_cgi_local_redir() rename
rename http_response_process_local_redir() -> http_cgi_local_redir()

adjust some checks for local redir
2021-03-04 17:52:01 -05:00
Glenn Strauss cc35c03c3c [core] RFC 7233 Range handling for non-streaming
RFC 7233 Range handling for all non-streaming responses,
including (non-streaming) dynamic responses

(previously Range responses handled only for static files)
2021-03-02 10:14:25 -05:00
Glenn Strauss 1ca25d4e2c [core] 101 upgrade fails if Content-Length incl (fixes #3063)
(thx daimh)

commit 903024d7 in lighttpd 1.4.57 fixed issue #3046 but in the process
broke HTTP/1.1 101 Switching Protocols which included Content-Length: 0
in the response headers.  Content-Length response header is permitted
by the RFCs, but not necessary with HTTP status 101 Switching Protocols.

  "websocket proxy fails if 101 Switching Protocols from backend includes Content-Length"
2021-02-04 00:22:12 -05:00
Glenn Strauss 649829f906 [mod_cgi] fix assert if empty X-Sendfile path (fixes #3062)
(thx axe34)

Please note that this would not have crashed "x-sendfile-docroot"
were set to restrict the locations of files that can be sent via
X-Sendfile.  If users are untrusted, "x-sendfile" should not be
enable without also configuring "x-sendfile-docroot"

  "Server Aborted due to Malicious Data sent through CGI Sendfile"
2021-02-01 04:11:38 -05:00
Glenn Strauss 891007fb6a [multiple] use HTTP_HEADER_* enum before strcmp
When known, use HTTP_HEADER_* enum before string comparisons
2021-01-07 08:58:30 -05:00
Glenn Strauss 72b9bb5ba3 [core] http_response_match_if_range()
separate func to check "If-Range"
2020-12-26 20:00:42 -05:00
Glenn Strauss b700a8ca09 [multiple] etag.[ch] -> http_etag.[ch]; better imp
more efficient implementation of HTTP ETag generation and comparison

modify dekhash() to take hash value to allow for incremental hashing
2020-12-25 14:41:16 -05:00
Glenn Strauss 1212f60991 buffer_append_path_len() to join paths
use buffer_append_path_len() to join path segments
2020-12-24 16:13:20 -05:00
Glenn Strauss 122094e3e3 [multiple] employ http_date.h, sys-time.h
- replace use of strptime() w/ implementation specialized for HTTP dates
- use thread-safe gmtime_r(), localtime_r() (replace localtime, gmtime)
2020-12-24 16:13:20 -05:00
Glenn Strauss 7ba521ffb4 [core] reuse large mem chunks (fix mem usage) (fixes #3033)
(thx flynn)

fix large memory usage for large file downloads from dynamic backends

reuse or release large memory chunks

  "Memory Growth with PUT and full buffered streams"
2020-12-24 00:20:27 -05:00
Glenn Strauss 76faed9145 [multiple] replace fall through comment with attr
replace /* fall through */ comment with __attribute_fallthrough__ macro

Note: not adding attribute to code with external origins:
  xxhash.h (algo_xxhash.h)
so to avoid warnings, may need to compile with -Wno-implicit-fallthrough
2020-12-16 05:16:25 -05:00
Glenn Strauss 903024d711 [core] track Content-Length from backend (fixes #3046)
track Content-Length from backend in r->resp_body_scratchpad

  "Failure on second request in http proxy backend"
2020-12-16 02:00:17 -05:00
Glenn Strauss 47aa6d4ac8 [core] http_response_buffer_append_authority()
make public func for benefit of external, third-party mod_authn_tkt
2020-11-10 06:10:27 -05:00
Glenn Strauss 169d8d3608 [core] accept "HTTP/2.0", "HTTP/3.0" from backends (fixes #3031)
accept "HTTP/2.0" and "HTTP/3.0" NPH from naive non-proxy backends

(thx flynn)

  "uwsgi fails with HTTP/2"
2020-11-09 19:00:58 -05:00
Glenn Strauss 019c513819 [multiple] use http_chunk_append_file_ref()
use http_chunk_append_file_ref() and http_chunk_append_file_ref_range()

reduce resource usage (number of fds open) by reference counting open
fds to files served, and sharing the fd among FILE_CHUNKs in responses
2020-10-20 11:51:48 -04:00
Glenn Strauss 96abd9cfb8 [core] coalesce nearby ranges in Range requests
Range requests must be HTTP/1.1 or later (not HTTP/1.0)
2020-10-19 21:40:14 -04:00
Glenn Strauss 66d1ec485c [core,mod_deflate] leverage cache of open fd
leverage simple cache of open file in stat_cache
(use stat_cache_get_entry_open())

future: reference count fd instead of dup()
  (requires extending chunkqueue interfaces)
2020-10-19 21:40:14 -04:00
Glenn Strauss e99126074c [core] pass open fd to http_response_parse_range 2020-10-13 22:31:10 -04:00
Glenn Strauss 6219b861ce [core] http_response_parse_range() const file sz 2020-10-13 22:31:10 -04:00