This is happening because Apache is using persistent connections with the Apache proxy for PHP-FPM, and at some point (perhaps when the site is under load) it tries to proxy more requests to PHP-FPM than PHP-FPM has workers available. Apache maintains a pool of these proxied connections, and once this happens the pool contains one or more connections that essentially failed to connect to PHP-FPM. Because of the enablereuse setting, each broken connection remains in the pool, so when Apache later tries to use that connection again, the same thing happens: the request results in a gateway timeout once the ProxySet timeout elapses, since the connection never really had a working PHP-FPM backend. That is a quick explanation of why it happens and why it continues to happen randomly once it starts.
Persistent Proxy Connections
The Apache proxy uses persistent connections to PHP-FPM because enablereuse is turned on in the line of the Apache conf file where the proxy is set up.
Apache's ProxyPass documentation explains everything, but to summarize: when enablereuse is enabled, each connection to the proxied backend is kept open for reuse; when it is disabled, the connection is closed immediately after it is used.
I did some testing, and when I disabled enablereuse the problem went away. When there are no available PHP-FPM connections for a proxy to use, that one attempt may fail, but the connection is killed off immediately and is not reused for other requests.
Disabling the enablereuse setting can mostly solve the problem, in that your proxy pool won't become corrupted with bad connections that affect future requests. You may still see a few failures from time to time (when under load), but at least they don't start happening all of the time from that point forward.
The above resolution is quick but not a perfect fix, and I also wanted to keep using persistent connections to reduce the constant overhead of setting up new connections.
So why is Apache trying to proxy more requests to PHP-FPM than is available?
Limit Allowed Persistent Proxied Connections
We can solve all of the issues by ensuring that Apache never tries to proxy more requests to PHP-FPM than PHP-FPM has available. Remember that for this situation PHP-FPM essentially has 30 workers available due to the PHP-FPM setup:
pm = static
pm.max_children = 30
So if we want to use persistent connections, we need to ensure that Apache never tries to exceed 30 proxied requests at any given time.
Another configuration item you can set in Apache's Proxy directive is max. The documentation above explains this as:
Maximum number of connections that will be allowed to the backend server. The default for this limit is the number of threads per process in the active MPM. In the Prefork MPM, this is always 1, while with other MPMs, it is controlled by the ThreadsPerChild directive.
This is confusing because you will need to do a little bit of work to figure out how to set this in a way that Apache will never try to allocate more proxies in the pool than workers you have available for PHP-FPM. Here is what you need to know:
- Must be less than or equal to the number of PHP-FPM workers available.
- Must account for the number of Apache (httpd) processes running threads, since max is set for each Apache process/server.
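The two rules above can be sketched as a single check. This is only an illustrative Python helper, not anything provided by Apache or PHP-FPM; the function name and its parameters are hypothetical.

```python
def pool_fits_backend(max_per_process: int, apache_processes: int,
                      php_fpm_workers: int) -> bool:
    """Return True when the worst-case number of proxied connections
    (max per Apache process times the number of Apache processes)
    does not exceed the number of PHP-FPM workers."""
    return max_per_process * apache_processes <= php_fpm_workers


# 15 connections per process across 2 processes fits 30 workers.
print(pool_fits_backend(15, 2, 30))  # True
# 25 connections per process across 8 processes does not.
print(pool_fits_backend(25, 8, 30))  # False
```

The numbers in the two calls are the ones worked out in the rest of this answer.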
So, for example, in our situation we had the following set for Apache: ServerLimit 200, ThreadsPerChild 25, and MaxRequestWorkers 200.
You will need to read the documentation to really understand what these mean, as it's not intuitive, but here is a quick summary:
- MaxRequestWorkers is the max allowed workers to handle requests
- ServerLimit is the max number of Apache processes/servers that can be spawned
- ThreadsPerChild is the max number of threads each process/server can have
With that said for our scenario:
200 servers * 25 ThreadsPerChild = 5000 threads to handle requests
However, since we have MaxRequestWorkers set to 200 and each server/process can have up to 25 threads, only 8 servers/processes will ever be spawned. We will never even get close to the ServerLimit of 200:
200 MaxRequestWorkers / 25 ThreadsPerChild = 8 Servers
Also, since we never set max in the proxy part of the configuration, according to the documentation above max defaults to ThreadsPerChild, which is 25.
So with all of that said we can calculate that the maximum number of proxied requests that Apache may try to create for PHP-FPM is 200:
8 Servers * 25 ThreadsPerChild = 200 Potential Proxied Requests issued
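The arithmetic above can be reproduced with a small Python sketch. It assumes, as this answer does, that the number of Apache processes is capped by both ServerLimit and MaxRequestWorkers / ThreadsPerChild, and that an unset max falls back to ThreadsPerChild. The function and parameter names are hypothetical and only mirror the directive names.

```python
def worst_case_proxied(server_limit, threads_per_child,
                       max_request_workers, proxy_max=None):
    """Return (apache_processes, total_proxied_connections) in the worst case."""
    # Processes that can actually be spawned: limited by ServerLimit and by
    # how many ThreadsPerChild-sized processes fit into MaxRequestWorkers.
    processes = min(server_limit, max_request_workers // threads_per_child)
    # Assumption from the quoted documentation: when max is not set,
    # it defaults to ThreadsPerChild (for threaded MPMs).
    per_process = threads_per_child if proxy_max is None else proxy_max
    return processes, processes * per_process


processes, total = worst_case_proxied(
    server_limit=200, threads_per_child=25, max_request_workers=200)
print(processes, total)  # 8 200
```

Comparing the second value against pm.max_children (30 here) shows how far the original setup could overshoot the PHP-FPM pool.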
200! We only have 30 PHP-FPM workers to handle these proxied requests, so this will eventually result in the problem above, where you will find the following in your error log:
[Thu Dec 08 04:22:06.264423 2022] [proxy_fcgi:error] [pid 1916665:tid 1916987] (70007)The timeout specified has expired: [client 22.214.171.124:9248] AH01075: Error dispatching request to :443: (polling)
In our original setup we had ProxySet timeout=240, which means that when someone visits your website and their request uses one of these broken proxied connections, it will spin for 4 minutes (240 seconds) before showing the Gateway Timeout error.
However you configure these variables, you have to make sure the end result is 30 or fewer proxied requests. If Apache tries to proxy even 1 request more than the limit, it will eventually result in this failure. I know this because I tested it with Apache's benchmarking tool, ab (ApacheBench):
ab -n 1000 -c 20 http://www.website.com/
While running this, it eventually hangs and results in failures, and if you examine Apache's logs you will find all sorts of timeouts eventually showing up. As long as the number of proxied requests never exceeds the number of workers allocated to PHP-FPM, I confirmed that the problem never happens. So with that said, here is how the configuration was changed and why.
Let's do the math here. With the ServerLimit at 2, and ThreadsPerChild at 100, the maximum number of threads that will ever be created is 200. We also set MaxRequestWorkers to match this number to be consistent.
2 Servers * 100 Threads = 200 Threads Total
Now, to figure out the max proxied requests allowed per server/process, remember it's simply the number of PHP-FPM workers available divided by the number of Apache servers/processes:
30 PHP Workers / 2 Servers = 15 Max per process
That is why we set ProxySet max=15. In other words, the most Apache servers that will ever be created is 2 and the most proxied requests per server is 15, which means that in total the most that can ever be proxied at any given time is 30. Since this is equal to or less than the number of PHP-FPM workers available, the problem is now gone!
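As a rough sketch of the same division in Python (the function name is hypothetical, and rounding down is my assumption for cases where the workers do not divide evenly, so the total never exceeds the PHP-FPM pool):

```python
def proxy_max_per_process(php_fpm_workers: int, apache_processes: int) -> int:
    """Largest per-process max that keeps the total at or below the worker count."""
    if apache_processes < 1:
        raise ValueError("apache_processes must be at least 1")
    # Floor division: any remainder is left unused rather than overshooting.
    return php_fpm_workers // apache_processes


per_process = proxy_max_per_process(php_fpm_workers=30, apache_processes=2)
print(per_process)            # 15
print(per_process * 2 <= 30)  # True
```

If the result comes out as 0, there are more Apache processes than PHP-FPM workers, and either the process limit or pm.max_children would need to change before persistent connections can fit.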
I have confirmed this with benchmarking as described above, and this allows you to continue to use persistent connections without running into the dreaded (70007)The timeout specified has expired error.