
I have an issue where randomly after a few hours to a few days I will start getting messages like this in the Apache error log:

[Thu Dec 08 04:22:06.264423 2022] [proxy_fcgi:error] [pid 1916665:tid 1916987] (70007)The timeout specified has expired: [client 185.220.101.47:9248] AH01075: Error dispatching request to :443: (polling)

When I visit the website while this is happening and hit refresh a few times, the browser's loading icon eventually just spins, and once the timeout is hit the result is a Gateway Timeout page that says: "The gateway did not receive a timely response from the upstream server or application."

Gateway Timeout after proxy fails with PHP-FPM

The most interesting observations from this are:

  1. First few hours to a few days zero errors or gateway timeouts.
  2. Once errors start, it then happens randomly for visitors (maybe 1 in 5 visitors).
  3. No other errors except in Apache error_log.

The server that the client/visitor connects to runs both Apache and PHP. Apache is set up with the following configuration using the Apache Event MPM:

StartServers            5
MaxRequestWorkers       200
ServerLimit             200
MinSpareThreads         25
MaxSpareThreads         75
ThreadLimit             64
ThreadsPerChild         25
MaxConnectionsPerChild  100000

PHP is set up as PHP-FPM, where it handles its own requests outside of Apache. This means that Apache proxies any PHP-related requests to PHP-FPM. The proxied portion is set up in Apache like this:

<FilesMatch \.php$>
    SetHandler "proxy:unix:/run/php-fpm/php.sock|fcgi://php/"
</FilesMatch>

<Proxy "fcgi://php/" enablereuse=on flushpackets=on>
    ProxySet connectiontimeout=5 timeout=240
</Proxy>

Finally, PHP-FPM is set up as static where it has a fixed number of child processes available to handle requests. This is set to 30:

pm = static
pm.max_children = 30

What I have found is that once this issue starts showing up in the logs, it will randomly keep continuing until I either:

  1. Restart the Apache Service
  2. Restart the PHP Service

As long as I restart at least one of those, everything goes back to normal until randomly it starts all over again a few hours to a few days later.

When you look up this problem on the internet, every discussion attributes this error to PHP running for a long time and thus hitting the timeout. That is not the case here: PHP finishes in around 100-200ms. This error is not caused by a long-running PHP process.

PHP does also interact with both MySQL and Redis on a different remote server. Initially, I thought there was a chance something was getting hung up there contributing to this, but after doing some tests with a PHP script on the same server that does not connect to MySQL or Redis, the same thing happens. Thus that is ruled out.

What is causing the "The timeout specified has expired" error with proxy_fcgi, where the request fails to dispatch and the end result is a 504 Gateway Timeout error shown in the browser?

  • I have managed to figure out the solution to this, but I created the question because there was zero help available on the Internet, and I am hoping this will help someone else down the road. (Brian Wozeniak)

3 Answers

Answered

This happens because Apache keeps persistent connections to PHP-FPM through its proxy and at some point (perhaps when the site is under load) tries to proxy more requests to PHP-FPM than PHP-FPM has workers available. Apache maintains a pool of these proxied connections, and once this happens the pool contains one or more connections that essentially failed to connect to PHP-FPM. Because of the enablereuse setting, the broken connection stays in the pool, so whenever Apache later tries to use it again the same thing happens: the request ends in a gateway timeout once the ProxySet timeout elapses, since there never was a working PHP-FPM connection behind it. That is the short explanation of why it happens and why it keeps happening randomly once it starts.

Persistent Proxy Connections

The reason the Apache proxy uses persistent connections to PHP-FPM is this setting in the Apache conf file where the proxy is set up:

enablereuse=on

Apache's ProxyPass documentation explains everything, but to summarize: when enablereuse is enabled, each connection to the proxied backend is kept open for reuse; when it is disabled, the connection is closed immediately after it is used.

I did some testing and if I disabled enablereuse, the problem went away. Thus when a situation happens where there are no available PHP-FPM connections for a proxy to use, it may fail for that attempt, but immediately the connection is killed off and is not reused for others.

Disabling the enablereuse setting can mostly solve the problem in that your proxy pool won't become corrupted with bad connections that will affect future requests. However, you will still potentially have a few failures from time to time (when under load), but at least it doesn't start happening all of the time from that point forward.

The above resolution is quick but not a perfect fix, and I also wanted to keep using persistent connections to reduce the constant overhead of setting up new connections.

So why is Apache trying to proxy more requests to PHP-FPM than is available?

Limit Allowed Persistent Proxied Connections

We can solve all of the issues by ensuring that Apache never tries to proxy more requests to PHP-FPM than PHP-FPM has available. Remember that for this situation PHP-FPM essentially has 30 workers available due to the PHP-FPM setup:

pm = static
pm.max_children = 30

So if we want to use persistent connections, we need to ensure that Apache never tries to exceed 30 proxied requests at any given time.

Another parameter you can set in Apache's Proxy configuration is max. The documentation above explains this as:

Maximum number of connections that will be allowed to the backend server. The default for this limit is the number of threads per process in the active MPM. In the Prefork MPM, this is always 1, while with other MPMs, it is controlled by the ThreadsPerChild directive.

This is confusing because you will need to do a little bit of work to figure out how to set this in a way that Apache will never try to allocate more proxies in the pool than workers you have available for PHP-FPM. Here is what you need to know:

  1. Must be less than or equal to the number of PHP-FPM workers available.
  2. Must account for the number of Apache (httpd) processes handling requests, since max is applied separately to each Apache process/server.

So for example in our situation you can see that we have the following set for Apache:

MaxRequestWorkers       200
ServerLimit             200
ThreadsPerChild         25

You will need to read the documentation to really understand what this means, as it's not intuitive, but here is a quick summary:

  • MaxRequestWorkers is the max allowed workers to handle requests
  • ServerLimit is the max number of Apache processes/servers that can be spawned
  • ThreadsPerChild is the max number of threads each process/server can have

With that said for our scenario:

200 servers * 25 ThreadsPerChild = 5000 threads to handle requests

However, since we have MaxRequestWorkers set to 200, and each server/process can have up to 25 threads, then only 8 servers/processes will ever be spawned. We will never even get close to the 200 ServerLimit set:

200 MaxRequestWorkers / 25 ThreadsPerChild = 8 Servers

Also, since we never set max in the proxy part of the configuration, according to the documentation quoted above max defaults to ThreadsPerChild, which is 25.

So with all of that said we can calculate that the maximum number of proxied requests that Apache may try to create for PHP-FPM is 200:

8 Servers * 25 ThreadsPerChild = 200 Potential Proxied Requests issued

200! We only have 30 PHP-FPM workers to handle these proxied requests, so this will eventually result in the problem above, where you will find the following in your error log:

[Thu Dec 08 04:22:06.264423 2022] [proxy_fcgi:error] [pid 1916665:tid 1916987] (70007)The timeout specified has expired: [client 185.220.101.47:9248] AH01075: Error dispatching request to :443: (polling)

In our original setup we had ProxySet timeout=240, which means when someone visits your website that uses one of these broken proxied connections, it will spin for 4 minutes (240 seconds) before showing the Gateway Timeout error.
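The arithmetic above can be double-checked with a few lines of Python. This is only a sketch: the function name `potential_proxied_connections` and its parameters are made up for illustration and simply mirror the directive names. It assumes that when max is unset it defaults to ThreadsPerChild (as the documentation quoted above says) and that a thread holds at most one backend connection at a time.

```python
def potential_proxied_connections(max_request_workers, server_limit,
                                  threads_per_child, proxy_max=None):
    """Estimate the worst-case number of backend connections Apache may open.

    Hypothetical helper for illustration; parameters mirror Apache directives.
    """
    # MaxRequestWorkers caps the total number of threads, so the number of
    # child processes is bounded by ServerLimit and by
    # MaxRequestWorkers / ThreadsPerChild, whichever is smaller.
    processes = min(server_limit, max_request_workers // threads_per_child)

    # When max is not set on the proxy worker it defaults to ThreadsPerChild.
    # Assumption: a process cannot use more connections than it has threads.
    if proxy_max is None:
        per_process = threads_per_child
    else:
        per_process = min(proxy_max, threads_per_child)

    return processes * per_process


# Values from the original configuration in the question.
print(potential_proxied_connections(
    max_request_workers=200, server_limit=200, threads_per_child=25))  # 200
```

Comparing the printed figure with pm.max_children (30 here) shows at a glance whether Apache can ask for more backend connections than PHP-FPM can serve.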

However you configure these directives, the end result has to be 30 or fewer proxied requests. If Apache tries to proxy even 1 request more than the limit, it will eventually result in this failure. I know this because I tested it with ApacheBench (ab):

ab -n 1000 -c 20 http://www.website.com/

While running this it eventually hangs and results in failures, and if you examine Apache's logs you will find all sorts of timeouts showing up. As long as the number of proxied requests never exceeds the number of workers allocated in PHP-FPM, I confirmed that the problem never happens. So with that said, here is how the configuration was changed:

MaxRequestWorkers       200
ServerLimit             2
StartServers            2
ThreadsPerChild         100
ThreadLimit             100
MinSpareThreads         100
MaxSpareThreads         200

<FilesMatch \.php$>
    SetHandler "proxy:unix:/run/php-fpm/php.sock|fcgi://php/"
</FilesMatch>

<Proxy "fcgi://php/">
    ProxySet enablereuse=on
    ProxySet max=15
    ProxySet flushpackets=on
    ProxySet connectiontimeout=5
    ProxySet timeout=30
</Proxy>

Let's do the math here. With the ServerLimit at 2, and ThreadsPerChild at 100, the maximum number of threads that will ever be created is 200. We also set MaxRequestWorkers to match this number to be consistent.

2 Servers * 100 Threads = 200 Threads Total

Now to figure out the max proxied requests allowed per server/process, remember it's simply the number of PHP-FPM processes available divided by the number of Apache servers/processes:

30 PHP Workers / 2 Servers = 15 Max per process

That is why we set ProxySet max=15. In other words, the most Apache servers that will ever be created is 2 and the most proxied requests per server is 15, which means that in total the most that can ever be proxied at any given time is 30. Since this is less than or equal to the number of PHP-FPM workers available, the problem is now gone!
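If your numbers differ, the same division can be scripted. The helper below is a rough sketch with a made-up name (`proxy_max_per_process`); it rounds down so the total never exceeds the PHP-FPM worker count, and it refuses configurations where there are more Apache processes than PHP-FPM workers, since no positive max would fit in that case.

```python
def proxy_max_per_process(php_fpm_workers, apache_processes):
    """Largest per-process proxy `max` that keeps the total within PHP-FPM's workers.

    Hypothetical helper for illustration only.
    """
    if apache_processes < 1:
        raise ValueError("apache_processes must be at least 1")

    # Round down: processes * max must stay <= php_fpm_workers.
    value = php_fpm_workers // apache_processes
    if value < 1:
        raise ValueError("more Apache processes than PHP-FPM workers")
    return value


# Values from the changed configuration: pm.max_children = 30, ServerLimit = 2.
print(proxy_max_per_process(30, 2))  # 15
```

When the division is not exact, rounding down leaves a few PHP-FPM workers unused by Apache, which is the safe side of the limit described above.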

I have confirmed this with benchmarking as described above, and this allows you to continue to use persistent connections without running into the dreaded (70007)The timeout specified has expired error.
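One way to check the result after a benchmark run is to count the AH01075 timeout entries in the Apache error log. The following is a minimal sketch, not part of the original setup: the function name and regular expressions are my own and are written against the single log line shown earlier, so other log formats may need adjusting.

```python
import re
from collections import Counter

# Matches proxy_fcgi timeout entries like the one quoted in the answer.
TIMEOUT_RE = re.compile(r"\[proxy_fcgi:error\].*\(70007\).*AH01075")
# Leading timestamp, e.g. "[Thu Dec 08 04:22:06.264423 2022]".
STAMP_RE = re.compile(r"^\[\w{3} (\w{3} \d{2}) (\d{2}):")


def count_timeouts_per_hour(lines):
    """Count AH01075 timeout entries, grouped by day and hour."""
    counts = Counter()
    for line in lines:
        if TIMEOUT_RE.search(line):
            match = STAMP_RE.match(line)
            key = f"{match.group(1)} {match.group(2)}:00" if match else "unknown"
            counts[key] += 1
    return counts


sample = [
    "[Thu Dec 08 04:22:06.264423 2022] [proxy_fcgi:error] "
    "[pid 1916665:tid 1916987] (70007)The timeout specified has expired: "
    "[client 185.220.101.47:9248] AH01075: Error dispatching request to :443: (polling)",
]
print(count_timeouts_per_hour(sample))
```

To use it on a real log, pass an open file object instead of `sample`; the log path depends on your distribution. An empty result after a load test suggests the pool is no longer accumulating broken connections.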

Answered

It appears that the issue you're encountering is related to the timeout settings for the proxy connection between Apache and PHP-FPM. The error message "The timeout specified has expired" indicates that a request was not completed within the specified timeout period.

In your Apache configuration, you have set the connectiontimeout to 5 seconds and the timeout to 240 seconds for the proxy connection. This means that if it takes longer than 5 seconds to establish a connection to PHP-FPM, or if the request takes longer than 240 seconds to complete, the proxy will time out and return a 503 error to the client.

It's possible that the issue is caused by a high number of requests or a high server load, which could cause the proxy to time out even though PHP-FPM is processing the request in a timely manner. You have set MaxRequestWorkers to 200, which is the maximum number of requests that Apache can handle simultaneously. If you have a high number of concurrent visitors to your website, this limit may be reached and cause the proxy to time out.

You may want to consider increasing the timeout value for the proxy connection to a higher value, like 300 seconds. This will give more time for the request to complete. Additionally, you might want to monitor the server load and number of requests and adjust the MaxRequestWorkers accordingly.

It's also important to check whether you have any issues with the network connectivity between Apache and PHP-FPM. Sometimes, network issues can cause delays in the communication between the two. You may want to use network monitoring tools to check for such issues.

Another thing to check is the health of the Redis and MySQL servers, which PHP interacts with. Sometimes, these servers may be overloaded and that can cause delays in processing requests.

As a final note, it is important to keep an eye on the Apache and PHP-FPM logs, as well as any other logs you may have, to check if there are any clues about the problem.

Answered

The error "proxy_fcgi:error (70007)The timeout specified has expired" typically occurs when a request to a FastCGI server takes longer than the specified timeout value. To fix this error, you can try the following:

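  • Increase the timeout value: In your web server's configuration file, look for the timeout directive and increase its value (e.g. from 60s to 120s). In nginx this is "fastcgi_read_timeout" or "proxy_read_timeout"; with Apache's mod_proxy_fcgi, as used in the question, it is the timeout parameter of ProxySet (or the ProxyTimeout directive).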

  • Check your PHP-FPM configuration: Make sure that the "request_terminate_timeout" value in your PHP-FPM configuration file is higher than the "fastcgi_read_timeout" value in your web server's configuration file.

  • Check your codebase: Make sure that your codebase is not performing any long-running operations that could be causing the timeout. Check your database queries, file operations, and external API calls to ensure they are not taking too long.

  • Check your server's resources: Make sure that your server has enough resources (such as CPU, memory, and disk space) to handle the incoming requests.

  • Check for any other errors: Check the error logs of your web server, PHP-FPM, and any other relevant components to see if there are any other errors that may be causing the problem.

It's important to note that this problem may be caused by different factors, so you may have to try different solutions to find the one that works for you.
