ПРОЕКТЫ 


  АРХИВ 


Apache-Talk @lexa.ru 

Inet-Admins @info.east.ru 

Filmscanners @halftone.co.uk 

Security-alerts @yandex-team.ru 

nginx-ru @sysoev.ru 


  СТАТЬИ 


  ПЕРСОНАЛЬНОЕ 


  ПРОГРАММЫ 



ПИШИТЕ
ПИСЬМА












     АРХИВ :: nginx-ru
Nginx-ru mailing list archive (nginx-ru@sysoev.ru)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: upstream prematurely closed connection


  • To: nginx-ru@xxxxxxxxx
  • Subject: Re: upstream prematurely closed connection
  • From: BaHbka <bahbka@xxxxxxxxx>
  • Date: Fri, 18 Jul 2008 09:28:16 +0400
  • Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from :user-agent:mime-version:to:subject:references:in-reply-to :content-type:content-transfer-encoding; bh=YLdQO3oye2yQrm4HhLMWRILzm34nKF1w6ail5qD7jxg=; b=plR+A8CwQIGBgZfhyStNINpVINUj48Z+NPldjJJ0cljdR+I+aJC9f6GzI9wsJSN38/ ICjAXYtPjH56/MqQKrq37s0OFu44fZUYQ1Osjm9wnkuwiOEaF97hh4+2CmwdGQHQr9Px houkGQTB0qNy3XZo0XIeNsrRcDWfOf+NumLqg=
  • Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:subject:references :in-reply-to:content-type:content-transfer-encoding; b=Z4mjzrMmJ2R3Xs2yD6yDZJpI0Rxjq2h4XM+buzuzLhWvv5d2bPizg/QbrRc7FZc2cE VFJt/GPDTwjCK8k/c49wZCchuvZzbPbo7MDqcGEjcV9nu0NKt/u1eynJeyyo8kQb8TZz gTokMRIjESKRyeshn8tIaKti5nPdxFYjKyyE4=
  • In-reply-to: <487FA12A.7020803@xxxxxxxxxxx>
  • References: <487F88D8.9040802@xxxxxxxxx> <487FA12A.7020803@xxxxxxxxxxx>

DHCP
Да, кстати, в логах php-fpm, в логах на бэкендах и на фронтенде ничего 
подозрительного нет.
Еще раз повторюсь, что соедниения обрываются со всеми бэкендами _одновременно_.

Вот на графике http://img91.imageshack.us/img91/3523/prematurelyclosedes9.png 
кол-во таких строчек в логе nginx'а за 5 минут.

Valery Kholodkov пишет:
На бездисковых бакэндах IP статический или DHCP?

BaHbka wrote:
Здравствуйте.

Имеются 1 фронтэнд, 4 бэкенда и 2 сервера mysql
Никак не могу справиться с одной проблемой: периодически в логе ngnix'а появляются строки следующего вида

2008/07/17 14:34:40 [error] 14415#0: *3460073 upstream prematurely closed connection while reading response header from upstream, ... upstream: "fastcgi://192.168.0.1:9000" 2008/07/17 14:34:40 [error] 14415#0: *3460070 upstream prematurely closed connection while reading response header from upstream, ... upstream: "fastcgi://192.168.0.3:9000" 2008/07/17 14:34:40 [error] 14415#0: *3460069 upstream prematurely closed connection while reading response header from upstream, ... upstream: "fastcgi://192.168.0.4:9000" 2008/07/17 14:34:40 [error] 14410#0: *3460247 upstream prematurely closed connection while reading response header from upstream, ... upstream: "fastcgi://192.168.0.4:9000"

после чего зачастую проскакивает no live upstreams while connecting to upstream

иногда бывают такие:

2008/07/17 14:54:21 [error] 14412#0: *3760898 kevent() reported about an closed connection (54: Connection reset by peer) while reading response header from upstream, ... upstream: "fastcgi://192.168.0.2:9000" 2008/07/17 14:54:21 [error] 14412#0: *3760900 kevent() reported about an closed connection (54: Connection reset by peer) while reading response header from upstream, ... upstream: "fastcgi://192.168.0.3:9000" 2008/07/17 14:54:21 [error] 14412#0: *3746025 kevent() reported about an closed connection (54: Connection reset by peer) while reading response header from upstream, ... upstream: "fastcgi://192.168.0.3:9000"

Причем появляются эти строчки пачками в одну секунду штук по 5-50 с периодичностью раз в 5-10 минут (в зависимости от нагрузки), в основном со всех бэкендов.
Утром например, когда кол-во посетителей небольшое этого не происходит.
Причем в реквесте часто бывают скрипты, которые вообще не используют mysql. В одном из таких тестовых скриптов вообще пхпшного кода нет (скрипт из одной строчки вида <input id=asd value=1>) :) Но бэкенд все равно разрывает соединение. Я думаю (но могу ошибаться) mysql тут ни при чем, а раз в одну секунду разрывается соединение со всеми бэками, то и они ни при чем. Подозреваю, что клиенты в этот момент получают 502-ю, т.к. сам бывало на нее нарывался.

В дебаг логе nginx'а следующее (не уверен, все ли что нужно захватил)

2008/07/16 16:17:52 [debug] 35120#0: *405143 http init upstream, client timer: 0 2008/07/16 16:17:52 [debug] 35120#0: *405143 kevent set event: 534: ft:-2 fl:0025
2008/07/16 16:17:52 [debug] 35120#0: *405143 http script copy: ""
...
2008/07/16 16:17:52 [debug] 35120#0: *405143 fastcgi param: "X_FORWARDED_FOR: 123.123.123.123"
2008/07/16 16:17:52 [debug] 35120#0: *405143 http script copy: ""
2008/07/16 16:17:52 [debug] 35120#0: *405143 fastcgi param: "SSL_CLIENT_SERIAL: 0" 2008/07/16 16:17:52 [debug] 35120#0: *405143 http cleanup add: 00000000006EDF40
2008/07/16 16:17:52 [debug] 35120#0: *405143 get rr peer, try: 4
2008/07/16 16:17:52 [debug] 35120#0: *405143 get rr peer, current: 0 2
2008/07/16 16:17:52 [debug] 35120#0: *405143 socket 237
2008/07/16 16:17:52 [debug] 35120#0: *405143 connect to 192.168.0.2:9000, fd:237 #405294 2008/07/16 16:17:52 [debug] 35120#0: *405143 kevent set event: 237: ft:-1 fl:0025 2008/07/16 16:17:52 [debug] 35120#0: *405143 kevent set event: 237: ft:-2 fl:0025
2008/07/16 16:17:52 [debug] 35120#0: *405143 http upstream connect: -2
2008/07/16 16:17:52 [debug] 35120#0: *405143 event timer add: 237: 75000:1216210747795 2008/07/16 16:17:58 [debug] 35120#0: *405143 http upstream check client, write event:1, "/blablabla.php"
2008/07/16 16:17:58 [debug] 35120#0: *405143 http upstream process header
2008/07/16 16:17:58 [debug] 35120#0: *405143 malloc: 000000000110E000:32768
2008/07/16 16:17:58 [debug] 35120#0: *405143 recv: eof:1, avail:0, err:0
2008/07/16 16:17:58 [error] 35120#0: *405143 upstream prematurely closed connection while reading response header from upstream, .....
2008/07/16 16:17:58 [debug] 35120#0: *405143 http next upstream, 2
2008/07/16 16:17:58 [debug] 35120#0: *405143 free rr peer 4 4
2008/07/16 16:17:58 [debug] 35120#0: *405143 free rr peer failed: 0 -5
2008/07/16 16:17:58 [debug] 35120#0: *405143 close http upstream connection: 237 2008/07/16 16:17:58 [debug] 35120#0: *405143 event timer del: 237: 1216210747795
2008/07/16 16:17:58 [debug] 35120#0: *405143 get rr peer, try: 3
2008/07/16 16:17:58 [debug] 35120#0: *405143 socket 140
2008/07/16 16:17:58 [debug] 35120#0: *405143 connect to 192.168.0.3:9000, fd:140 #406707 2008/07/16 16:17:58 [debug] 35120#0: *405143 kevent set event: 140: ft:-1 fl:0025 2008/07/16 16:17:58 [debug] 35120#0: *405143 kevent set event: 140: ft:-2 fl:0025
2008/07/16 16:17:58 [debug] 35120#0: *405143 http upstream connect: -2

или вот, например

2008/07/16 16:17:49 [debug] 35117#0: *404981 connect to 192.168.0.3:9000, fd:560 #405020 2008/07/16 16:17:49 [debug] 35117#0: *404981 kevent set event: 560: ft:-1 fl:0025 2008/07/16 16:17:49 [debug] 35117#0: *404981 kevent set event: 560: ft:-2 fl:0025
2008/07/16 16:17:49 [debug] 35117#0: *404981 http upstream connect: -2
2008/07/16 16:17:49 [debug] 35117#0: *404981 event timer add: 560: 75000:1216210744998
2008/07/16 16:17:59 [debug] 35117#0: *404981 post event 0000000000DA58A0
2008/07/16 16:17:59 [debug] 35117#0: *404981 post event 0000000000AFAA20
2008/07/16 16:17:59 [debug] 35117#0: *404981 post event 0000000000DA6A20
2008/07/16 16:17:59 [debug] 35117#0: *404981 delete posted event 0000000000DA6A20 2008/07/16 16:17:59 [debug] 35117#0: *404981 http upstream send request handler
2008/07/16 16:17:59 [debug] 35117#0: *404981 http upstream send request
2008/07/16 16:17:59 [debug] 35117#0: *404981 chain writer buf fl:0 s:1320
2008/07/16 16:17:59 [debug] 35117#0: *404981 chain writer in: 000000000106EF68
2008/07/16 16:17:59 [debug] 35117#0: *404981 writev: 1320 of 1320
2008/07/16 16:17:59 [debug] 35117#0: *404981 chain writer out: 0000000000000000 2008/07/16 16:17:59 [debug] 35117#0: *404981 event timer del: 560: 1216210744998 2008/07/16 16:17:59 [debug] 35117#0: *404981 event timer add: 560: 180000:1216210859644
2008/07/16 16:17:59 [debug] 35117#0: *404981 http upstream process header
2008/07/16 16:17:59 [debug] 35117#0: *404981 malloc: 00000000006E9000:32768
2008/07/16 16:17:59 [debug] 35117#0: *404981 recv: eof:1, avail:0, err:0
2008/07/16 16:17:59 [error] 35117#0: *404981 upstream prematurely closed connection while reading response header from upstream, .....
2008/07/16 16:17:59 [debug] 35117#0: *404981 http next upstream, 2
2008/07/16 16:17:59 [debug] 35117#0: *404981 free rr peer 4 4
2008/07/16 16:17:59 [debug] 35117#0: *404981 free rr peer failed: 2 -11
2008/07/16 16:17:59 [debug] 35117#0: *404981 close http upstream connection: 560

ktrace'ом бэкендов ничего не смог выловить. Видимо просто не умею его готовить :)

Схема следующая:
---
один фронтэнд - FreeBSD 6.2-RELEASE-p7 amd64
nginx 0.6.32 (до этого был nginx 0.5.34 - ситуация такая же) ставился из портов

netstat -Lan
tcp4  0/0/4096       123.123.123.123.80

netstat -m
6652/9503/16155 mbufs in use (current/cache/total)
2249/7867/10116/32768 mbuf clusters in use (current/cache/total/max)
2231/6351 mbuf+clusters out of packet secondary zone in use (current/cache)
6161K/18109K/24270K bytes allocated to network (current/cache/total)

ngnix.conf:

user                    www www;
worker_processes        8;

...

events {
        worker_connections  25000;
        use kqueue;
}

http {
...
        client_header_timeout           1m;
        client_body_timeout             1m;
        send_timeout                    1m;

        client_header_buffer_size       32k;
        large_client_header_buffers     16 16k;

        gzip                            on;
        gzip_min_length                 1100;
        gzip_buffers                    4 8k;
gzip_types text/plain text/css application/x-javascript;

        output_buffers                  1 32k;
        postpone_output                 1460;

        sendfile                        on;
        sendfile_max_chunk              128k;
        tcp_nopush                      on;
        tcp_nodelay                     on;
        send_lowat                      12000;

        keepalive_timeout               75 20;

        fastcgi_hide_header             x-powered-by;
        fastcgi_ignore_client_abort     off;
        fastcgi_next_upstream           error;

        upstream backend
        {
                server  192.168.0.1:9000        weight=10;
                server  192.168.0.2:9000        weight=10;
                server  192.168.0.3:9000        weight=10;
                server  192.168.0.4:9000        weight=10;
        }

        server {

                listen                  123.123.123.123:80;
                server_name             www.***.ru ***.ru;
error_log /var/log/nginx/nginx-error.log error;

                client_body_buffer_size 1024k;
                client_max_body_size    4m;

...

               location ~ \.php$
                {
                        access_log                              off;
error_log /var/log/nginx/php-error.log error;
                        fastcgi_pass                            backend;
fastcgi_index index.php;
                        fastcgi_connect_timeout                 75;
                        fastcgi_send_timeout                    180;
                        fastcgi_read_timeout                    180;
                        fastcgi_buffer_size                     32k;
                        fastcgi_buffers                         8 256k;
                        fastcgi_busy_buffers_size               256k;
                        fastcgi_temp_file_write_size            256k;
fastcgi_param SCRIPT_FILENAME /var/www/www.***.ru/php/$fastcgi_script_name;
...

---
четыре бездисковых бэкенда FreeBSD 7.0-RELEASE-p1 amd64 (раньше было FreeBSD 6.2, php 5.2.4 без php-fpm патча - такая же ситуация)
php 5.2.6 + php-fpm 0.5.8-rc1

netstat -Lan
tcp4  0/0/4096       *.9000

netstat -m
262/3578/3840 mbufs in use (current/cache/total)
260/3624/3884/25600 mbuf clusters in use (current/cache/total/max)
260/2812 mbuf+clusters out of packet secondary zone in use (current/cache)
585K/9030K/9616K bytes allocated to
 network (current/cache/total)

php-fpm.conf:
...
                <section name="pool">
                        <value name="name">first</value>
                        <value name="listen_address">0.0.0.0:9000</value>
...
                                <value name="max_children">100</value>
...
<value name="request_execution_timeout">601s</value>
                        <value name="rlimit_files">2048</value>
                        <value name="rlimit_core">0</value>
...
                        <value name="max_requests">500</value>
...

Где копать? На фронтэнде или все таки бэкенды выеживаются?

Заранее благодарю :)








 




Copyright © Lexa Software, 1996-2009.