55 Commits

Author SHA1 Message Date
Glenn Strauss 042622c8c1 [core] use pread() to skip lseek() 2021-10-01 06:39:47 -04:00
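
Note on the entry above: a minimal sketch of what replacing an lseek()+read() pair with a single pread() can look like, assuming a POSIX system. The helper name `read_at` is hypothetical and is not taken from the lighttpd source.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not lighttpd code): read up to len bytes starting at
 * offset without moving the file position of fd.  A single pread() replaces
 * the lseek()+read() pair, which also avoids surprises when the same fd is
 * shared by several users that each expect their own offset. */
static ssize_t read_at(int fd, void *buf, size_t len, off_t offset)
{
    ssize_t rd;
    assert(buf != NULL);
    do {
        rd = pread(fd, buf, len, offset);
    } while (rd < 0 && errno == EINTR); /* retry if interrupted by a signal */
    return rd; /* bytes read, 0 at end of file, or -1 with errno set */
}
```

Because pread() takes the offset as an argument, the caller keeps track of the position itself (for example in a per-chunk offset field) rather than relying on the kernel's file position.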
Glenn Strauss f19f71625c [multiple] internal control for backend read bytes
separate internal control for backend max_per_read

When not streaming, large reads will be flushed to temp files on disk.
When streaming, use a smaller buffer to help reduce memory usage.

When not streaming, attempt to read and empty kernel socket bufs.
  (e.g. MAX_READ_LIMIT 256k)

When writing to sockets (or pipes) attempt to fill kernel socket bufs.
  (e.g. MAX_WRITE_LIMIT 256k)
2021-09-28 11:05:55 -04:00
Glenn Strauss e97a5b7e49 [core] clear buffer after backend dechunk
(thx flynn)

clear buffer after backend dechunk if not sending chunked to client

  "Memory fragmentation with HTTP/2 enabled"
2021-09-09 16:12:37 -04:00
Glenn Strauss 67c0b1498a [multiple] remove base.h include where not used
(substitute request.h if file only accesses request_st,
 and not connection or server structs)
2021-09-08 15:06:07 -04:00
Glenn Strauss 833d658729 [core] http_response_append_{buffer,mem}()
manage r->resp_body_scratchpad in new funcs
rather than
which now only decode chunked encoding, more apropos for the func names
2021-09-08 15:06:06 -04:00
Glenn Strauss 1ca721d479 [core] reduce excess cc inlining in http_chunk.c 2021-08-27 02:16:54 -04:00
Glenn Strauss af3df29ae8 [multiple] reduce redundant NULL buffer checks
This commit is a large set of code changes and results in removal of
hundreds, perhaps thousands, of CPU instructions, a portion of which
are on hot code paths.

Most (buffer *) used by lighttpd are not NULL, especially since buffers
were inlined into numerous larger structs such as request_st and chunk.

In the small number of instances where that is not the case, a NULL
check is often performed earlier in a function where that buffer is
later used with a buffer_* func.  In the handful of cases that remained,
a NULL check was added, e.g. with r->http_host and r->conf.server_tag.

- check for empty strings at config time and set value to NULL if blank
  string will be ignored at runtime; at runtime, simple pointer check
  for NULL can be used to check for a value that has been set and is not
  blank ("")
- use buffer_is_blank() instead of buffer_string_is_empty(),
  and use buffer_is_unset() instead of buffer_is_empty(),
  where buffer is known not to be NULL so that NULL check can be skipped
- use buffer_clen() instead of buffer_string_length() when buffer is
  known not to be NULL (to avoid NULL check at runtime)
- use buffer_truncate() instead of buffer_string_set_length() to
  truncate string, and use buffer_extend() to extend

Examples where buffer known not to be NULL:
  - cpv->v.b from config_plugin_values_init is not NULL if T_CONFIG_BOOL
    (though we might set it to NULL if buffer_is_blank(cpv->v.b))
  - address of buffer is arg (&foo)
    (compiler optimizer detects this in most, but not all, cases)
  - buffer is checked for NULL earlier in func
  - buffer is accessed in same scope without a NULL check (e.g. b->ptr)

internal behavior change:
  callers must not pass a NULL buffer to some funcs.
  - buffer_init_buffer() requires non-null args
  - buffer_copy_buffer() requires non-null args
  - buffer_append_string_buffer() requires non-null args
  - buffer_string_space() requires non-null arg
2021-08-27 02:16:53 -04:00
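
Note on the entry above: the difference between the NULL-tolerant accessors and the accessors that assume a non-NULL buffer can be sketched as follows. This is a simplified stand-in under stated assumptions, not the lighttpd implementation: `sketch_buffer` and all `sketch_*` functions are hypothetical names, and the only properties taken from the commit messages in this log are that `used` counts the terminating 0 and that `used == 0` marks an unset buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a lighttpd-style buffer (assumption, not the real
 * struct): used counts the terminating '\0'; used == 0 means "unset". */
typedef struct {
    char *ptr;
    uint32_t used;
    uint32_t size;
} sketch_buffer;

/* Older style: tolerates NULL, at the cost of a branch on every call. */
static uint32_t sketch_string_length(const sketch_buffer *b)
{
    return (b != NULL && b->used != 0) ? b->used - 1 : 0;
}

/* Newer style: the caller guarantees b is not NULL, so no NULL branch. */
static uint32_t sketch_clen(const sketch_buffer *b)
{
    assert(b != NULL); /* documents the contract; compiled out with NDEBUG */
    return b->used - (b->used != 0);
}

/* "blank" covers both unset and "" (nothing besides an optional '\0'). */
static int sketch_is_blank(const sketch_buffer *b)
{
    return b->used < 2;
}

/* "unset" means no string was ever stored, not even "". */
static int sketch_is_unset(const sketch_buffer *b)
{
    return b->used == 0;
}
```

The config-time step described in the entry fits the same pattern: if blank values are replaced by NULL once at startup, the runtime test for "set and not blank" reduces to a pointer comparison.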
Glenn Strauss 5ff9e2f6eb [core] remove some (now) unused http_chunk APIs
remove http_chunk_append_file() and http_chunk_append_file_range()

callers should choose to use stat_cache_entry_open() for caching
or should open file and check sizes and ranges
2021-05-14 03:43:41 -04:00
Glenn Strauss 1ce8220947 [core] range chk http_chunk_append_file_ref_range
add range sanity check in http_chunk_append_file_ref_range()
(before potentially sending HTTP/1.1 chunked header)
2021-05-14 03:43:18 -04:00
Glenn Strauss e8de53cb74 [core] fix chunkqueue_small_resp_optim partial rd 2021-03-22 07:10:30 -04:00
Glenn Strauss cabced1f9f [core] fix decoding chunked from backend (fixes #3049)
(thx flynn)

fix decoding chunked from backend

truncate response and error out if backend sends excess data
after chunked encoding

  "Too much content with HTTP/2.0"
2020-12-17 03:59:41 -05:00
Glenn Strauss 903024d711 [core] track Content-Length from backend (fixes #3046)
track Content-Length from backend in r->resp_body_scratchpad

  "Failure on second request in http proxy backend"
2020-12-16 02:00:17 -05:00
Glenn Strauss 167513c840 [core] track chunked encoding state from backend (fixes #3046)
(thx flynn)

track chunked encoding state when parsing backend response

  "Failure on second request in http proxy backend"
2020-12-14 19:59:08 -05:00
Glenn Strauss 3230c6ef17 [core] reject excess data after chunked encoding (#3046)
reject excess data after chunked encoding when parsing backend response

  "Failure on second request in http proxy backend"
2020-12-14 19:55:00 -05:00
Glenn Strauss 163cb8be28 [core] fix chunked decoding from backend (fixes #3044)
(thx flynn)

  "Socket errors after update to version 1.4.56"
2020-12-14 12:34:49 -05:00
Glenn Strauss edfc5f394e [core] consolidate chunk size checks 2020-11-27 08:12:21 -05:00
Glenn Strauss 1b74c50854 [core] always lseek() with shared fd
always lseek() with shared fd; remove optim to skip with offset = 0
2020-10-20 11:51:48 -04:00
Glenn Strauss 9078cc4ce8 [core] http_chunk_append_file_ref_range()
http_chunk_append_file_ref() and http_chunk_append_file_ref_range()
to take stat_cache_entry ref and append FILE_CHUNK
2020-10-20 11:51:48 -04:00
Glenn Strauss 9c25581d6f [core] alloc optim reading file, sending chunked
avoid potential double-copy due to not enough space for final '\0'
in http_chunk_append_read_fd_range() if read size is exactly multiple
of 8k and sending chunked response
2020-10-19 21:40:14 -04:00
Glenn Strauss e99126074c [core] pass open fd to http_response_parse_range 2020-10-13 22:31:10 -04:00
Glenn Strauss 81029b8b51 [multiple] inline chunkqueue where always alloc'd
inline struct chunkqueue where always allocated in other structs

(memory locality)
2020-10-11 12:19:27 -04:00
Glenn Strauss 97e314fc9e [multiple] inline chunkqueue_length() 2020-10-11 12:19:26 -04:00
Glenn Strauss 5fd8a26a75 [core] defer optimization to read small files
defer optimization to read small files into memory until after
response_start hooks have a chance to run, e.g. until after
mod_deflate chooses whether or not to serve file from compressed
cache, if deflate.cache-dir is configured
2020-10-11 12:19:24 -04:00
Glenn Strauss 7420526ddb [core] decode Transfer-Encoding: chunked from gw
decode Transfer-Encoding: chunked from gw (gateway backends)

Transfer-Encoding: chunked is a hop-by-hop header.

Handling chunked encoding removes a hurdle for mod_proxy to send HTTP/1.1
requests to backends and be able to handle HTTP/1.1 responses.

Other backends ought not to send Transfer-Encoding: chunked, but in
practice, some implementations do.
2020-08-02 07:47:42 -04:00
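
Note on the entry above: a minimal sketch of decoding a Transfer-Encoding: chunked body, assuming the whole body is already in memory. The function `dechunk` is hypothetical and is not the streaming decoder used by lighttpd; chunk extensions and trailer fields are deliberately left out. It also rejects excess data after the final chunk, which is one simple way to handle the situation that later entries in this log address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical one-shot decoder (not lighttpd's streaming implementation).
 * Decodes a complete chunked body from in[0..inlen) into out.
 * Returns the decoded length, or -1 if the input is malformed, incomplete,
 * does not fit in out, or carries excess data after the final chunk.
 * Chunk extensions and trailer fields are not supported in this sketch. */
static long dechunk(const char *in, size_t inlen, char *out, size_t outsz)
{
    size_t i = 0, o = 0;
    assert(in != NULL && out != NULL);
    for (;;) {
        size_t sz = 0, digits = 0;
        for (; i < inlen; ++i, ++digits) { /* parse hex chunk size */
            char c = in[i];
            unsigned v;
            if (c >= '0' && c <= '9') v = (unsigned)(c - '0');
            else if (c >= 'a' && c <= 'f') v = (unsigned)(c - 'a' + 10);
            else if (c >= 'A' && c <= 'F') v = (unsigned)(c - 'A' + 10);
            else break;
            if (sz > (((size_t)-1) >> 4)) return -1; /* size would overflow */
            sz = (sz << 4) | v;
        }
        if (digits == 0) return -1;
        if (inlen - i < 2 || in[i] != '\r' || in[i + 1] != '\n') return -1;
        i += 2;
        if (sz == 0) {
            /* last chunk: expect the terminating CRLF (no trailers here) */
            if (inlen - i < 2 || in[i] != '\r' || in[i + 1] != '\n') return -1;
            i += 2;
            return (i == inlen) ? (long)o : -1; /* reject excess data */
        }
        if (sz > inlen - i || sz > outsz - o) return -1;
        memcpy(out + o, in + i, sz); /* copy chunk payload */
        i += sz;
        o += sz;
        if (inlen - i < 2 || in[i] != '\r' || in[i + 1] != '\n') return -1;
        i += 2;
    }
}
```

For the input `4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n` the sketch returns 9 and writes `Wikipedia` to the output buffer. A real gateway decoder has to keep its parse state between reads, because a chunk header or payload can be split across socket reads.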
Glenn Strauss c18f442a63 [multiple] add summaries to top of some modules 2020-07-08 22:51:31 -04:00
Glenn Strauss 7c7f8c467c [multiple] split con, request (very large change)
NB: r->tmp_buf == srv->tmp_buf (pointer is copied for quicker access)

NB: request read and write chunkqueues currently point to connection
    chunkqueues; per-request and per-connection chunkqueues are
    not distinct from one another
      con->read_queue  == r->read_queue
      con->write_queue == r->write_queue

NB: in the future, a separate connection config may be needed for
    connection-level module hooks.  Similarly, might need to have
    per-request chunkqueues separate from per-connection chunkqueues.
    Should probably also have a request_reset() which is distinct from
2020-07-08 19:54:29 -04:00
Glenn Strauss 68d8d4c532 [multiple] stat_cache singleton 2020-07-08 19:54:28 -04:00
Glenn Strauss 0fcd51438d [core] create http chunk header on the stack
streamline code in http_chunk.c
2020-07-08 19:54:28 -04:00
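
Note on the entry above: a minimal sketch of building an HTTP/1.1 chunk header in a stack buffer instead of a heap-allocated one. The helper `chunk_header` is a hypothetical name and the code is an illustration of the technique, not the function from http_chunk.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the lighttpd function): write the chunk header
 * "<hex-size>\r\n" into a caller-supplied buffer, typically on the stack.
 * buf must hold at least 2*sizeof(uintmax_t)+2 bytes.
 * Returns the number of bytes written; the output is not NUL-terminated. */
static size_t chunk_header(char *buf, size_t bufsz, uintmax_t len)
{
    static const char hex[] = "0123456789abcdef";
    char tmp[2 * sizeof(uintmax_t)]; /* two hex digits per byte at most */
    size_t n = 0, i = 0;
    assert(buf != NULL && bufsz >= sizeof(tmp) + 2);
    do { /* collect hex digits, least significant first */
        tmp[n++] = hex[len & 0xF];
        len >>= 4;
    } while (len != 0);
    while (n != 0) buf[i++] = tmp[--n]; /* reverse into the output */
    buf[i++] = '\r';
    buf[i++] = '\n';
    return i;
}
```

With a local `char hdr[2 * sizeof(uintmax_t) + 2];`, a length of 8192 produces the six bytes `2000\r\n`, which can then be copied into the write queue without any intermediate allocation.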
Glenn Strauss 010c28949c [multiple] prefer (connection *) to (srv *)
convert all log_error_write() to log_error() and pass (log_error_st *)

use con->errh in preference to srv->errh (even though currently same)

avoid passing (server *) when previously used only for logging (errh)
2020-07-08 19:54:28 -04:00
Glenn Strauss 2ec70f234a [core] stat_cache_path_contains_symlink use errh
use log_error() with con->errh
2020-07-08 18:08:52 -04:00
Glenn Strauss ef0a211733 [core] adjust http_chunk read() retry loop 2019-05-18 13:02:59 -04:00
Glenn Strauss a86ea83b5a [core] chunkqueue perf: read small files into mem 2019-05-13 21:01:57 -04:00
Glenn Strauss 470a692211 [core] http_chunk_append_file_fd() 2019-05-04 13:48:22 -04:00
Glenn Strauss 73bfee6308 [stat_cache] separate func for symlink policy chk
Note: historical ToC-ToU race condition still exists in implementation
server.follow-symlink = "disable" is not recommended (default: "enable")
2019-04-29 18:11:15 -04:00
Glenn Strauss 37bd124ae4 [core] pass conf.follow_symlink in more places 2019-03-10 23:22:58 -04:00
Glenn Strauss f69bd9cdb8 [core] perf: simple, quick buffer_clear()
quickly clear buffer instead of buffer_string_set_length(b, 0) or
buffer_reset(b).  Avoids free() of large buffers about to be reused,
or buffers that are module-scoped, persistent, and reused.

(buffer_reset() should still be used with buffers in connection *con
 when the data in the buffers is supplied by external, untrusted source)
2018-11-23 00:37:38 -05:00
Glenn Strauss c79bc31609 [mod_fastcgi] perf: reduce data copies
2018-11-12 08:25:05 -05:00
Glenn Strauss e7c840502a [core] perf: copy small strings; better buf reuse
copy small strings to write queue for better buffer reuse
(instead of swapping with larger buffers in write chunkqueue)
2018-10-27 14:00:08 -04:00
Glenn Strauss 9725299587 [core] code reuse with http_response_body_clear()
code reuse with http_response_body_clear()
rename con->response.transfer_encoding to con->response.send_chunked
2018-09-23 18:01:58 -04:00
Glenn Strauss 04d76e7afd [core] some header cleanup
provide standard types in first.h instead of base.h
provide lighttpd types in base_decls.h instead of settings.h
reduce headers exposed by headers for core data structures
  do not expose <pcre.h> or <stdlib.h> in headers
move stat_cache_entry to stat_cache.h
reduce use of "server.h" and "base.h" in headers
2018-04-08 22:22:23 -04:00
Glenn Strauss ba953cdf45 [core] include "fdevent.h" where needed
(instead of providing #include "fdevent.h" in base.h)
2017-03-28 02:17:33 -04:00
Glenn Strauss a53f662a30 [core] remove some unused header includes
remove exposure of stdio.h in buffer.h for print_backtrace(), now static
2017-03-28 02:17:33 -04:00
Glenn Strauss 18a7b2be37 [core] option to stream response body to client (fixes #949, #760, #1283, #1387)
Set = 1 or = 2
to have lighttpd stream response body to client as it arrives from the
backend (CGI, FastCGI, SCGI, proxy).

default: buffer entire response body before sending response to client.
(This preserves existing behavior for now, but may in the future be
 changed to stream response to client, which is the behavior more
 commonly expected.)

  "fastcgi, cgi, flush, php5 problem."
  "Random crashing on FreeBSD 6.1"
  "Memory usage increases when proxy+ssl+large file"
  "lighttpd+fastcgi memory problem"
2016-06-19 23:34:16 -04:00
Glenn Strauss 5a91fd4b90 [core] buffer large responses to tempfiles (fixes #758, fixes #760, fixes #933, fixes #1387, #1283, fixes #2083)
This replaces buffering entire response in memory which might lead to
huge memory footprint and possibly to memory exhaustion.

use tempfiles of fixed size so disk space is freed as each file sent

update callers of http_chunk_append_mem() and http_chunk_append_buffer()
to handle failures when writing to tempfile.

  "memory fragmentation leads to high memory usage after peaks"
  "Random crashing on FreeBSD 6.1"
  "lighty should buffer responses (after it grows above certain size) on disk"
  "Memory usage increases when proxy+ssl+large file"
  "lighttpd+fastcgi memory problem"
  "Excessive Memory usage with streamed files from PHP"
2016-06-12 02:51:10 -04:00
Glenn Strauss a65c57a548 [core] open fd when appending file to cq (fixes #2655)
http_chunk_append_file() opens fd when appending file to chunkqueue.
Defers calculation of content length until response is finished.

This reduces race conditions pertaining to stat() and then (later)
open(), when the result of the stat() was used for Content-Length
or to generate chunked headers.

Note: this does not change how lighttpd handles files that are modified
in-place by another process after having been opened by lighttpd --
don't do that.  This *does* improve handling of files that are
frequently modified via a temporary file and then atomically renamed
into place.

mod_fastcgi has been modified to use http_chunk_append_file_range() with
X-Sendfile2 and will open the target file multiple times if there are
multiple ranges.

Note: (future todo) not implemented for chunk.[ch] interfaces used by
range requests in mod_staticfile or by mod_ssi.  Those uses could lead
to too many open fds.  For mod_staticfile, limits should be put in place
for max number of ranges accepted by mod_staticfile.  For mod_ssi,
limits would need to be placed on the maximum number of includes, and
the primary SSI file split across lots of SSI directives should either
copy the pieces or perhaps chunk.h could be extended to allow for an
open fd to be shared across multiple chunks.  Doing either of these
would improve the performance of SSI since they would replace many file
opens on the pieces of the SSI file around the SSI directives.

  "Serving a file that is getting updated can cause an empty response or incorrect content-length error"

Closes #49
2016-04-18 04:27:08 -04:00
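
Note on the entry above: a minimal sketch of the open-then-fstat() pattern that avoids the stat()-then-open() race, assuming a POSIX system. The helper `open_regular_with_size` is hypothetical and not part of lighttpd.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open first, then fstat() the open fd, so that the
 * size used for Content-Length describes the same file that will be sent,
 * even if the path is atomically renamed over in between.
 * Returns an open fd (caller closes it) or -1 on failure. */
static int open_regular_with_size(const char *path, off_t *size)
{
    struct stat st;
    int fd;
    assert(path != NULL && size != NULL);
    fd = open(path, O_RDONLY | O_NONBLOCK); /* O_NONBLOCK: do not hang on a FIFO */
    if (fd < 0) return -1;
    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode)) {
        close(fd); /* not a regular file, or fstat() failed */
        return -1;
    }
    *size = st.st_size;
    return fd;
}
```

As the entry itself points out, this does not help when another process modifies the file in place after it has been opened; it only covers replacement via a temporary file and rename.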
Glenn Strauss 8abd06a7ff consistent inclusion of config.h at top of files (fixes #2073)
From: Glenn Strauss <>

git-svn-id: svn:// 152afb58-edef-0310-8abb-c4023f1b3aa9
2016-03-19 15:14:35 +00:00
Stefan Bühler 91a9a6b391 rename buffer_append_long_hex to buffer_append_uint_hex
* takes uintmax_t now
* use in http_chunk_append_len

From: Stefan Bühler <>

git-svn-id: svn:// 152afb58-edef-0310-8abb-c4023f1b3aa9
2015-02-08 19:10:46 +00:00
Stefan Bühler ad3e93ea96 Use buffer API to read and modify "used" member
- a lot of code tried to handle manually adding terminating zeroes and
  keeping track of the correct "used" count.
  Replaced all "external" usages with simple wrapper functions:
  * buffer_string_is_empty (used <= 1), buffer_is_empty (used == 0);
    prefer buffer_string_is_empty
  * buffer_string_set_length
  * buffer_string_length
  * CONST_BUF_LEN() macro
- removed "static" buffer hacks (buffers pointing to constant/stack
  memory instead of malloc()ed data)
- buffer_append_strftime(): refactor buffer+strftime uses
- li_tohex(): no need for a buffer for binary-to-hex conversion:
  the output data length is easy to predict
- remove "-Winline" from extra warnings: the "inline" keyword just
  suppresses the warning about unused but defined (static) functions;
  don't care whether it actually gets inlined or not.

From: Stefan Bühler <>

git-svn-id: svn:// 152afb58-edef-0310-8abb-c4023f1b3aa9
2015-02-08 19:10:44 +00:00
Stefan Bühler 4365bdbebe Remove buffer_prepare_copy() and buffer_prepare_append()
* removed almost all usages of buffer as "memory" (without terminating
* refactored cgi variable name encoding

From: Stefan Bühler <>

git-svn-id: svn:// 152afb58-edef-0310-8abb-c4023f1b3aa9
2015-02-08 19:10:39 +00:00
Stefan Bühler 6afad87d2e fix buffer, chunk and http_chunk API
* remove unused structs and functions
    (buffer_array, read_buffer)
  * change return type from int to void for many functions,
    as the return value (indicating error/success) was never checked,
    and the function would only fail on programming errors and not on
    invalid input; changed functions to use force_assert instead of
    returning an error.
  * all "len" parameters now are the real size of the memory to be read.
    the length of strings is given always without the terminating 0.
  * the "buffer" struct still counts the terminating 0 in ->used,
    provide buffer_string_length() to get the length of a string in a
    unset config "strings" have used == 0, which is used in some places
    to distinguish unset values from "" (empty string) values.
  * most buffer usages should now use it as string container.
  * optimise some buffer copying by "moving" data to other buffers
  * use (u)intmax_t for generic int-to-string functions
  * remove unused enum values: UNUSED_CHUNK, ENCODING_UNSET
  * converted BUFFER_APPEND_SLASH to inline function (no macro feature
  * refactor: create chunkqueue_steal: moving (partial) chunks into another
  * http_chunk: added separate function to terminate chunked body instead of
    magic handling in http_chunk_append_mem().
    http_chunk_append_* now handle empty chunks, and never terminate the
    chunked body.

From: Stefan Bühler <>

git-svn-id: svn:// 152afb58-edef-0310-8abb-c4023f1b3aa9
2015-02-08 12:37:10 +00:00