WebFaction
Community site: login faq

I'm running a WordPress site using FastCGI and every week or so the number of fcgi processes goes from the usual 7 to something approaching 30 and everything ends up being killed by WebFaction because the memory usage has exceeded the maximum 1GB.

The .htaccess file has the following at the top:

<FilesMatch \.php$>
    SetHandler php56-fcgi6
</FilesMatch>

Here's an normal ps listing:

  PID   RSS                  STARTED   TIME COMMAND
  788  2176 Tue Jul 17 16:53:13 2018   0:00 -bash
 1445  1844 Tue Jul 17 16:53:54 2018   0:00 ps -u username -o pid,rss,lstart,bsdtime,command
 1446   972 Tue Jul 17 16:53:54 2018   0:00 awk {print $0}{sum+=$2} END {print "Total", sum/1024, "MB"}
14551 16388 Tue Jul 17 15:56:48 2018   0:00 /home/php-cgi/php56.fcgi
14552 77372 Tue Jul 17 15:56:48 2018   4:14 /home/php-cgi/php56.fcgi
14553 79780 Tue Jul 17 15:56:48 2018   4:14 /home/php-cgi/php56.fcgi
14554 80016 Tue Jul 17 15:56:48 2018   4:13 /home/php-cgi/php56.fcgi
14555 81012 Tue Jul 17 15:56:48 2018   4:20 /home/php-cgi/php56.fcgi
14556 77048 Tue Jul 17 15:56:48 2018   4:17 /home/php-cgi/php56.fcgi
14557 78868 Tue Jul 17 15:56:48 2018   4:22 /home/php-cgi/php56.fcgi
34635  2172 Tue Jul 17 15:36:04 2018   0:00 -bash
36505  2140 Tue Jul 17 15:37:45 2018   0:00 -bash
Total 488.074 MB

And here's an example of when it has gone wrong:

  PID   RSS                  STARTED   TIME COMMAND
  689 76352 Sun Jul  8 20:56:41 2018   0:04 /home/php-cgi/php56.fcgi
  693 33228 Sun Jul  8 20:56:41 2018   0:02 /home/php-cgi/php56.fcgi
  694 70368 Sun Jul  8 20:56:41 2018   0:03 /home/php-cgi/php56.fcgi
 8596 41760 Sun Jul  8 21:01:47 2018   0:02 /home/php-cgi/php56.fcgi
 8598 35208 Sun Jul  8 21:01:47 2018   0:02 /home/php-cgi/php56.fcgi
 8599 41812 Sun Jul  8 21:01:47 2018   0:02 /home/php-cgi/php56.fcgi
 8601 76796 Sun Jul  8 21:01:47 2018   0:02 /home/php-cgi/php56.fcgi
 8602 76712 Sun Jul  8 21:01:47 2018   0:01 /home/php-cgi/php56.fcgi
 8604 77308 Sun Jul  8 21:01:47 2018   0:01 /home/php-cgi/php56.fcgi
 8608 41448 Sun Jul  8 21:01:47 2018   0:01 /home/php-cgi/php56.fcgi
 8609 40768 Sun Jul  8 21:01:47 2018   0:03 /home/php-cgi/php56.fcgi
 8611 41520 Sun Jul  8 21:01:47 2018   0:02 /home/php-cgi/php56.fcgi
 8613 69880 Sun Jul  8 21:01:47 2018   0:02 /home/php-cgi/php56.fcgi
 8614 77040 Sun Jul  8 21:01:47 2018   0:03 /home/php-cgi/php56.fcgi
 8617 77392 Sun Jul  8 21:01:47 2018   0:02 /home/php-cgi/php56.fcgi
 8618 70404 Sun Jul  8 21:01:47 2018   0:01 /home/php-cgi/php56.fcgi
 8621 39952 Sun Jul  8 21:01:47 2018   0:02 /home/php-cgi/php56.fcgi
 8623 74988 Sun Jul  8 21:01:47 2018   0:01 /home/php-cgi/php56.fcgi
 8625 77004 Sun Jul  8 21:01:47 2018   0:02 /home/php-cgi/php56.fcgi
 9069 16444 Sun Jul  8 03:16:11 2018   0:00 /home/php-cgi/php56.fcgi
 9322  2184 Sun Jul  8 21:02:26 2018   0:00 -bash
11998  1840 Sun Jul  8 21:05:01 2018   0:00 ps -u username -o pid,rss,lstart,bsdtime,command
11999   968 Sun Jul  8 21:05:01 2018   0:00 awk {print $0}{sum+=$2} END {print "Total", sum/1024, "MB"}
15462 16444 Sun Jul  8 20:32:01 2018   0:00 /home/php-cgi/php56.fcgi
15757 16448 Sun Jul  8 20:32:19 2018   0:00 /home/php-cgi/php56.fcgi
16183 16444 Sun Jul  8 20:32:53 2018   0:00 /home/php-cgi/php56.fcgi
16647 16444 Sun Jul  8 20:33:10 2018   0:00 /home/php-cgi/php56.fcgi
35793 40872 Sun Jul  8 20:46:31 2018   0:08 /home/php-cgi/php56.fcgi
35803 33452 Sun Jul  8 20:46:31 2018   0:07 /home/php-cgi/php56.fcgi
35807 78028 Sun Jul  8 20:46:31 2018   0:08 /home/php-cgi/php56.fcgi
44304 34652 Sun Jul  8 20:51:37 2018   0:06 /home/php-cgi/php56.fcgi
44306 75664 Sun Jul  8 20:51:37 2018   0:06 /home/php-cgi/php56.fcgi
44308 80192 Sun Jul  8 20:51:37 2018   0:06 /home/php-cgi/php56.fcgi
44312 77732 Sun Jul  8 20:51:37 2018   0:06 /home/php-cgi/php56.fcgi
44313 50808 Sun Jul  8 20:51:37 2018   0:07 /home/php-cgi/php56.fcgi
44315 48976 Sun Jul  8 20:51:37 2018   0:07 /home/php-cgi/php56.fcgi
44316 87060 Sun Jul  8 20:51:37 2018   0:06 /home/php-cgi/php56.fcgi
44318 77176 Sun Jul  8 20:51:37 2018   0:07 /home/php-cgi/php56.fcgi
Total 1869.46 MB

I thought the SetHandler php56-fcgi6 was supposed to limit the number of FastCGI processes to 6?

Does anyone have any suggestions as to how we can further investigate this, and find out what is causing the explosion in processes?

Thanks in advance!

asked 17 Jul, 16:57

pd2018
112
accept rate: 0%


The thread limit set should limit the number of threads started, unless there is a forking bug or something in your PHP trying to execute a command in shell which is failing. With FCGI you should be using a task queue to manage external threads. It is hard to see if this is the case from the data above. Ideally you would want to use a debugger like strace while the problem is happening to get more info from the stuck threads.

First identify the PID of the process you want,

ps -u $USER -o pid,rss,command

Once you know the PID you can run strace, replace PID with the number from ps,

strace -tTvf -s 1024 -p PID

This will provide verbose information from which you may be able to figure out what is causing the failure.

permanent link

answered 17 Jul, 21:02

johns ♦♦
5.3k212
accept rate: 23%

Thank you for the suggestion, I have incorporated strace into my script that monitors memory usage and this will hopefully give me some more clues.

In the meantime, I have noticed that we have the following crontab entry, I don't know when or why this was added, or indeed if it is part of some standard WebFaction configuration:

13,33,53 * * * * /home/username/webapps/phpstack/apache2/bin/start > /dev/null 2>&1

Is this still needed these days, and what does/did it do?

Thanks!

permanent link

answered 21 Jul, 09:32

pd2018
112
accept rate: 0%

phpstack was an old custom apache and php that was often used on some CentOS 5 servers when customers needed something specific.

You should be able to safely delete that cron job after verifying none of your websites is still using that application.

(21 Jul, 11:54) aaront ♦♦

Update: I'm now logging memory usage every five minutes, and if there are more than the normal number of php56.fcgi processes then I do an strace on all of them to see what they are doing. The normal number of php56.fcgi processes seems to be 7 - one of these is parent to the other six, e.g. as follows:

Date Tue Sep 4 09:35:01 UTC 2018
  PID  PPID   RSS    VSZ                  STARTED   TIME STAT COMMAND
15654 15286 16408 258680 Tue Sep  4 01:38:02 2018   0:00 Ss   /home/php-cgi/php56.fcgi
24155 15654 80328 422080 Tue Sep  4 08:27:24 2018   3:03 S    /home/php-cgi/php56.fcgi
24497 15654 79084 411584 Tue Sep  4 08:27:54 2018   3:07 S    /home/php-cgi/php56.fcgi
25070 15654 80084 411700 Tue Sep  4 08:28:26 2018   3:12 S    /home/php-cgi/php56.fcgi
25099 15654 75844 408056 Tue Sep  4 08:28:31 2018   2:54 S    /home/php-cgi/php56.fcgi
25145 15654 77116 408760 Tue Sep  4 08:28:35 2018   3:06 S    /home/php-cgi/php56.fcgi
30320 15654 80196 412412 Tue Sep  4 08:30:45 2018   3:09 S    /home/php-cgi/php56.fcgi

I can't get info about the parent process (15286), though. I assume this is hidden from me by WebFaction's shared server?

Five minutes later, another seven processes have appeared, so the total number is now 14. At the end of each line above is what the strace says the process is doing - "busy" means it seems to me like it's doing WordPress stuff.

Date Tue Sep 4 09:40:01 UTC 2018
  PID  PPID   RSS    VSZ                  STARTED   TIME STAT COMMAND
15654 15286 16408 258680 Tue Sep  4 01:38:02 2018   0:00 Ss   /home/php-cgi/php56.fcgi  wait4(-1)
16726 15286 16348 258680 Tue Sep  4 09:39:55 2018   0:00 Ss   /home/php-cgi/php56.fcgi  wait4(-1)   
16727 16726  9788 258680 Tue Sep  4 09:39:55 2018   0:00 S    /home/php-cgi/php56.fcgi  accept(0)
16728 16726  7288 258680 Tue Sep  4 09:39:55 2018   0:00 S    /home/php-cgi/php56.fcgi  accept(0)
16729 16726  7288 258680 Tue Sep  4 09:39:55 2018   0:00 S    /home/php-cgi/php56.fcgi  accept(0)
16730 16726  7288 258680 Tue Sep  4 09:39:55 2018   0:00 S    /home/php-cgi/php56.fcgi  busy
16731 16726  7288 258680 Tue Sep  4 09:39:55 2018   0:00 S    /home/php-cgi/php56.fcgi  accept(0)
16732 16726  7288 258680 Tue Sep  4 09:39:55 2018   0:00 S    /home/php-cgi/php56.fcgi  accept(0)
24155 15654 80384 421824 Tue Sep  4 08:27:24 2018   3:16 S    /home/php-cgi/php56.fcgi  accept(0)
24497 15654 78384 410816 Tue Sep  4 08:27:54 2018   3:22 S    /home/php-cgi/php56.fcgi  busy
25070 15654 79044 410852 Tue Sep  4 08:28:26 2018   3:31 S    /home/php-cgi/php56.fcgi  busy
25099 15654 76060 418800 Tue Sep  4 08:28:31 2018   3:08 S    /home/php-cgi/php56.fcgi  accept(0)
25145 15654 76604 408504 Tue Sep  4 08:28:35 2018   3:19 S    /home/php-cgi/php56.fcgi  accept(0)
30320 15654 80196 412412 Tue Sep  4 08:30:45 2018   3:22 S    /home/php-cgi/php56.fcgi  accept(0)

Five minutes later, the number of processes has grown to 21 - notice this is a multiple of 7 - and they are all in either wait4() or accept():

Date Tue Sep 4 09:45:02 UTC 2018
  PID  PPID   RSS    VSZ                  STARTED   TIME STAT COMMAND
15654 15286 16408 258680 Tue Sep  4 01:38:02 2018   0:00 Ss   /home/php-cgi/php56.fcgi  wait4(-1)
16726 15286 16348 258680 Tue Sep  4 09:39:55 2018   0:00 Ss   /home/php-cgi/php56.fcgi  wait4(-1)
16727 16726 76740 420232 Tue Sep  4 09:39:55 2018   0:05 S    /home/php-cgi/php56.fcgi  accept(0)
16728 16726 76024 408572 Tue Sep  4 09:39:55 2018   0:03 S    /home/php-cgi/php56.fcgi  accept(0)
16729 16726 77396 410184 Tue Sep  4 09:39:55 2018   0:05 S    /home/php-cgi/php56.fcgi  accept(0)
16730 16726 77096 410196 Tue Sep  4 09:39:55 2018   0:06 S    /home/php-cgi/php56.fcgi  accept(0)
16731 16726 75360 418480 Tue Sep  4 09:39:55 2018   0:05 S    /home/php-cgi/php56.fcgi  accept(0)
16732 16726 76340 409252 Tue Sep  4 09:39:55 2018   0:02 S    /home/php-cgi/php56.fcgi  accept(0)
20943 15286 16352 258680 Tue Sep  4 09:41:09 2018   0:00 Ss   /home/php-cgi/php56.fcgi  wait4(-1)
20944 20943 75868 419128 Tue Sep  4 09:41:09 2018   0:02 S    /home/php-cgi/php56.fcgi  accept(0)
20945 20943 57388 389892 Tue Sep  4 09:41:09 2018   0:04 S    /home/php-cgi/php56.fcgi  accept(0)
20946 20943 76052 408788 Tue Sep  4 09:41:09 2018   0:03 S    /home/php-cgi/php56.fcgi  accept(0)
20947 20943 52352 298732 Tue Sep  4 09:41:09 2018   0:03 S    /home/php-cgi/php56.fcgi  accept(0)
20948 20943 68764 315100 Tue Sep  4 09:41:09 2018   0:02 S    /home/php-cgi/php56.fcgi  accept(0)
20949 20943 75508 408104 Tue Sep  4 09:41:09 2018   0:03 S    /home/php-cgi/php56.fcgi  accept(0)
24155 15654 79904 421824 Tue Sep  4 08:27:24 2018   3:24 S    /home/php-cgi/php56.fcgi  accept(0)
24497 15654 78384 410816 Tue Sep  4 08:27:54 2018   3:27 S    /home/php-cgi/php56.fcgi  accept(0)
25070 15654 79756 411364 Tue Sep  4 08:28:26 2018   3:36 S    /home/php-cgi/php56.fcgi  accept(0)
25099 15654 76060 418800 Tue Sep  4 08:28:31 2018   3:13 S    /home/php-cgi/php56.fcgi  accept(0)
25145 15654 77552 409292 Tue Sep  4 08:28:35 2018   3:24 S    /home/php-cgi/php56.fcgi  accept(0)
30320 15654 80416 423156 Tue Sep  4 08:30:45 2018   3:30 S    /home/php-cgi/php56.fcgi  accept(0)

Five minutes later, 7 of the processes have disappeared (it is the most recent 7 that have gone):

Date Tue Sep 4 09:50:01 UTC 2018
  PID  PPID   RSS    VSZ                  STARTED   TIME STAT COMMAND
15654 15286 16408 258680 Tue Sep  4 01:38:02 2018   0:00 Ss   /home/php-cgi/php56.fcgi  wait4(-1)
16726 15286 16348 258680 Tue Sep  4 09:39:55 2018   0:00 Ss   /home/php-cgi/php56.fcgi  wait4(-1)
16727 16726 76708 419976 Tue Sep  4 09:39:55 2018   0:12 S    /home/php-cgi/php56.fcgi  accept(0)
16728 16726 76168 408848 Tue Sep  4 09:39:55 2018   0:10 S    /home/php-cgi/php56.fcgi  accept(0)
16729 16726 77884 410440 Tue Sep  4 09:39:55 2018   0:10 S    /home/php-cgi/php56.fcgi  accept(0)
16730 16726 77576 410196 Tue Sep  4 09:39:55 2018   0:11 S    /home/php-cgi/php56.fcgi  accept(0)
16731 16726 75648 418736 Tue Sep  4 09:39:55 2018   0:11 S    /home/php-cgi/php56.fcgi  accept(0)
16732 16726 76468 409252 Tue Sep  4 09:39:55 2018   0:07 S    /home/php-cgi/php56.fcgi  accept(0)
24155 15654 80532 422100 Tue Sep  4 08:27:24 2018   3:29 S    /home/php-cgi/php56.fcgi  accept(0)
24497 15654 79032 411344 Tue Sep  4 08:27:54 2018   3:33 S    /home/php-cgi/php56.fcgi  busy
25070 15654 79756 411364 Tue Sep  4 08:28:26 2018   3:42 S    /home/php-cgi/php56.fcgi  accept(0)
25099 15654 77488 420592 Tue Sep  4 08:28:31 2018   3:18 S    /home/php-cgi/php56.fcgi  accept(0)
25145 15654 76424 407992 Tue Sep  4 08:28:35 2018   3:27 S    /home/php-cgi/php56.fcgi  accept(0)
30320 15654 80416 423156 Tue Sep  4 08:30:45 2018   3:35 S    /home/php-cgi/php56.fcgi  accept(0)

And another five minutes later we are back to the original 7 processes:

Date Tue Sep 4 09:55:01 UTC 2018
  PID  PPID   RSS    VSZ                  STARTED   TIME STAT COMMAND
15654 15286 16408 258680 Tue Sep  4 01:38:02 2018   0:00 Ss   /home/php-cgi/php56.fcgi
24155 15654 80532 422100 Tue Sep  4 08:27:24 2018   3:40 S    /home/php-cgi/php56.fcgi
24497 15654 78544 411072 Tue Sep  4 08:27:54 2018   3:45 S    /home/php-cgi/php56.fcgi
25070 15654 79756 411364 Tue Sep  4 08:28:26 2018   3:53 S    /home/php-cgi/php56.fcgi
25099 15654 76588 419568 Tue Sep  4 08:28:31 2018   3:28 S    /home/php-cgi/php56.fcgi
25145 15654 77192 408760 Tue Sep  4 08:28:35 2018   3:39 S    /home/php-cgi/php56.fcgi
30320 15654 80416 423156 Tue Sep  4 08:30:45 2018   3:46 S    /home/php-cgi/php56.fcgi

This pattern repeats itself several times a day, and the total number of fcgi process is always a multiple of 7. Does this give any clue as to what could be going wrong? Does anyone know exactly how these fcgi processes are created and managed?

In another incident later that day, we managed to get up to a total of 70 fcgi processes, which exceeded the 1GB memory limit:

Tue Sep 4 20:45:01 UTC 2018:  7 fcgi     memory usage: 239MB
Tue Sep 4 20:50:01 UTC 2018: 14 fcgi     memory usage: 1046MB 
Tue Sep 4 20:55:01 UTC 2018: 70 fcgi     memory usage: 1605MB

At this point all 70 fcgi processes were in wait4() or accept(), with the majority being in accept()

Tue Sep 4 21:00:01 UTC 2018: 70 fcgi     memory usage: 1711MB

Here most are in wait4() or accept(), a couple seem to be doing WordPress stuff.

Tue Sep 4 21:05:01 UTC 2018:  70 fcgi    memory usage: 1836MB
Tue Sep 4 21:10:02 UTC 2018:  63 fcgi    memory usage: 1692MB
Tue Sep 4 21:15:01 UTC 2018:  56 fcgi    memory usage: 1521MB
Tue Sep 4 21:20:01 UTC 2018:  49 fcgi    memory usage: 1530MB
Tue Sep 4 21:25:01 UTC 2018:  70 fcgi    memory usage: 1946MB
Tue Sep 4 21:30:01 UTC 2018:  63 fcgi    memory usage: 1896MB
Tue Sep 4 21:35:01 UTC 2018:  56 fcgi    memory usage: 1709MB
Tue Sep 4 21:40:01 UTC 2018:  56 fcgi    memory usage: 1738MB
Tue Sep 4 21:45:01 UTC 2018:  49 fcgi    memory usage: 1664MB
Tue Sep 4 21:50:01 UTC 2018:  42 fcgi    memory usage: 1592MB
Tue Sep 4 21:55:01 UTC 2018:  70 fcgi    memory usage: 2495MB
Tue Sep 4 22:00:01 UTC 2018:  70 fcgi    memory usage: 2548MB
Tue Sep 4 22:05:01 UTC 2018:  63 fcgi    memory usage: 2145MB
Tue Sep 4 22:10:01 UTC 2018:  56 fcgi    memory usage: 1950MB
Tue Sep 4 22:15:01 UTC 2018:  49 fcgi    memory usage: 1763MB
Tue Sep 4 22:20:02 UTC 2018:  42 fcgi    memory usage: 1573MB
Tue Sep 4 22:25:01 UTC 2018:  35 fcgi    memory usage: 1305MB
Tue Sep 4 22:30:01 UTC 2018:  28 fcgi    memory usage: 1027MB
Tue Sep 4 22:35:01 UTC 2018:  21 fcgi    memory usage: 789MB
Tue Sep 4 22:40:01 UTC 2018:  28 fcgi    memory usage: 399MB
Tue Sep 4 22:45:02 UTC 2018:  21 fcgi    memory usage: 406MB
Tue Sep 4 22:50:01 UTC 2018:  14 fcgi    memory usage: 404MB
Tue Sep 4 22:55:01 UTC 2018:   7 fcgi    memory usage: 178MB

Any ideas what could explain this behaviour? Is it even worth using FastCGI anyway? Are other people using it? Should we just disable FastCGI and go back to whatever the default PHP setup is?

Thanks in advance, suggestions welcome!

permanent link

answered 06 Sep, 00:53

pd2018
112
accept rate: 0%

Hi, just wondering if anyone has any advice on how I can further investigate this? Does anyone know how the FastCGI mechanism is controlled, and what could be going wrong in this case? I can't seem to find anything useful in the WebFaction documentation about this. How can I debug this?

Another option is to disable FastCGI and go back to the default config. We switched to FastCGI about five years ago because we were finding that we'd regularly run of out memory (and get a warning from WebFaction) due to too many CGI processes being created (as I understood one CGI process is created for each incoming request). The idea of FastCGI was to limit the number of processes being created. Nowadays we have a much higher 1GB memory limit though, so maybe disabling FastCGI now is a practical option.

A final option is just to ignore the fact that we exceeded our memory limit every week or so, and allow WebFaction to kill the processes as and when. I think it would be far better to understand exactly what is going on, though.

permanent link

answered 26 Sep, 08:21

pd2018
112
accept rate: 0%

Switch from FastCGI to regular CGI and then use strace, you should get more feedback.

I have some other tips on this post.

permanent link

answered 26 Sep, 21:06

johns ♦♦
5.3k212
accept rate: 23%

Thanks! Will try that and see how we get on.

(28 Sep, 09:08) pd2018
Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "title")
  • image?![alt text](/path/img.jpg "title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported

Question tags:

×242
×86
×10
×2

question asked: 17 Jul, 16:57

question was seen: 439 times

last updated: 28 Sep, 09:15

WEBFACTION
REACH US
SUPPORT
AFFILIATE PROGRAM
LEGAL
© COPYRIGHT 2003-2016 SWARMA LIMITED - WEBFACTION IS A SERVICE OF SWARMA LIMITED
REGISTERED IN ENGLAND AND WALES 5729350 - VAT REGISTRATION NUMBER 877397162
5TH FLOOR, THE OLD VINYL FACTORY, HAYES, UB3 1HA, UNITED KINGDOM