I've run into a strangeness with my TLS-based web server. It seems that, for every incoming request, three file descriptors are used (they all seem to be sockets), and they aren't immediately cleaned up.
With keep-alive disabled, they CAN be disposed of, but it takes an explicit gc() call. With keep-alive active, they aren't garbage, so they stick around even with a gc() call. (Which is probably correct, but there might need to be a limit on how many get retained.)
constant listen_addr = "::", listen_port = 9876;
void http_handler(Protocols.HTTP.Server.Request req) {
    int before = sizeof(get_dir("/proc/self/fd"));
    int garbo = gc(); //Disabling this check results in FDs accumulating.
    int after = sizeof(get_dir("/proc/self/fd"));
    werror("Garbage %O, closed %d files, now %d open\n", garbo, before - after, after);
    //The "Connection: close" header is vital to the sockets becoming garbage.
    //Without it, they are retained pending a followup request, which doesn't change the
    //fundamental issue but does mean that a call to gc() doesn't clean them up.
    req->response_and_finish((["data": "OK", "extra_heads": (["Connection": "close"])]));
    //req->response_and_finish((["data": "OK"]));
}
int main() {
    //If you don't have a cert, the first request is slower b/c generating self-signed.
    //This ONLY happens with SSL connections. There are *three* file descriptors wasted
    //for every request.
    string cert = Stdio.read_file("certificate_local.pem");
    string key = Stdio.read_file("privkey_local.pem");
    array certs = cert && Standards.PEM.Messages(cert)->get_certificates();
    string pk = key && Standards.PEM.simple_decode(key);
    Protocols.HTTP.Server.SSLPort(http_handler, listen_port, listen_addr, pk, certs);
    return -1;
}
What are the three FDs used for? Presumably one of them is the actual incoming socket, but the other two are less obvious.
This phenomenon does not seem to happen with non-SSL ports, possibly relating to the simpler shutdown sequence for unencrypted sockets.
This isn't usually a major problem, as the GC does get run eventually, but under certain types of workloads, it's easy to overload it and run out of FDs. This can be triggered by removing the gc() call from each request, and then any of these will spin up until the process FD limit is hit:
// Pike:
while (1) Protocols.HTTP.get_url_data("https://YOUR_SERVER.EXAMPLE:9876/");

# Python:
import requests
while 1: requests.get("https://YOUR_SERVER.EXAMPLE:9876/")

: Bash:
while wget -qO/dev/null https://YOUR_SERVER.EXAMPLE:9876/; do true; done

// JavaScript in a web browser, if not blocked:
while (1) await(fetch("https://YOUR_SERVER.EXAMPLE:9876/"));
So I think it's probably not purely an issue with a client library misbehaving.
Is there a better way to handle this than simply forcing garbage collection every request?
ChrisA
Hi Chris.
> I've run into a strangeness with my TLS-based web server. It seems that, for every incoming request, three file descriptors are used (they all seem to be sockets), and they aren't immediately cleaned up.
> With keep-alive disabled, they CAN be disposed of, but it takes an explicit gc() call. With keep-alive active, they aren't garbage, so they stick around even with a gc() call. (Which is probably correct, but there might need to be a limit on how many get retained.)
> [...]
> What are the three FDs used for? Presumably one of them is the actual incoming socket, but the other two are less obvious.
> This phenomenon does not seem to happen with non-SSL ports, possibly relating to the simpler shutdown sequence for unencrypted sockets.
> This isn't usually a major problem, as the GC does get run eventually, but under certain types of workloads, it's easy to overload it and run out of FDs. This can be triggered by removing the gc() call from each request, and then any of these will spin up until the process FD limit is hit:
> [...]
> So I think it's probably not purely an issue with a client library misbehaving.
> Is there a better way to handle this than simply forcing garbage collection every request?
The best way is to break the reference cycle before dropping the object on the floor (or to avoid introducing it to begin with), or to destruct the problematic objects explicitly.
To identify the cycles you may want to log the cycles detected by the gc; take a look at base_server/roxen.pike:reinstall_gc_callbacks().
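As a rough illustration of the two options above (breaking the cycle by hand, or destructing explicitly), here is a minimal sketch. The Wrapper and Inner classes and the helper functions are hypothetical stand-ins made up for this example; they are not the objects that Protocols.HTTP.Server or the SSL code actually create, and /dev/null is only used as a convenient thing to hold open. The /proc/self/fd counting is borrowed from the test program earlier in the thread.

```pike
// Hypothetical classes for illustration only.
class Wrapper {
  object inner;    // Wrapper -> Inner
}

class Inner {
  object owner;    // Inner -> Wrapper, which closes the cycle
  Stdio.File fd;   // resource that stays open while the cycle is alive
}

int count_fds() {
  return sizeof(get_dir("/proc/self/fd"));
}

// Build a Wrapper <-> Inner cycle that holds one open file.
Wrapper make_pair() {
  Wrapper w = Wrapper();
  Inner i = Inner();
  w->inner = i;
  i->owner = w;
  i->fd = Stdio.File("/dev/null", "r");
  return w;
}

// Option 1: break the cycle so plain refcounting can free both objects.
void release(Wrapper w) {
  if (w->inner) w->inner->owner = 0;
  w->inner = 0;
}

int main() {
  int base = count_fds();

  // Dropped with the cycle intact: the file should stay open until gc().
  Wrapper w = make_pair();
  w = 0;
  werror("cycle dropped:  %d extra fds\n", count_fds() - base);
  gc();
  werror("after gc():     %d extra fds\n", count_fds() - base);

  // Option 1: break the cycle before dropping the last reference.
  w = make_pair();
  release(w);
  w = 0;
  werror("cycle broken:   %d extra fds\n", count_fds() - base);

  // Option 2: destruct the object that owns the resource.
  w = make_pair();
  destruct(w->inner);
  w = 0;
  werror("after destruct: %d extra fds\n", count_fds() - base);
  return 0;
}
```

In a real server the equivalent of release() or destruct() would presumably go wherever the connection is known to be finished, but which objects actually form the cycle in the SSL case is exactly what the gc logging is meant to reveal.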
/grubba
pike-devel@lists.lysator.liu.se