Scheduler dumps core; problem with sfio?
- To: zmailer@nic.funet.fi
- Subject: Scheduler dumps core; problem with sfio?
- From: "Dawid Kuroczko" <qnex42@gmail.com>
- Date: Fri, 18 Aug 2006 18:21:39 +0200
- Original-Recipient: rfc822;zmailer-log@nic.funet.fi
- Sender: zmailer-owner@nic.funet.fi
Hello.
I'm using a CVS HEAD version of ZMailer, on a Linux/x86 platform.
Recently I was trying to raise the limit of simultaneous transport
agents (TAs) run by the scheduler (from around 1000 processes to around 3000).
So, I raised the limit of open fds (ulimit -n 32767), and also
increased the number of open files as seen by the kernel:
echo 512000 > /proc/sys/fs/file-max
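For reference, the per-process half of that change (the `ulimit -n` part) can also be made from inside a program with the standard getrlimit()/setrlimit() calls. The following is only a minimal sketch: the helper name raise_nofile_limit is hypothetical and is not part of ZMailer, and nothing here claims the scheduler does this itself.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper (not ZMailer code): raise the soft RLIMIT_NOFILE
 * limit to `wanted`, capped at the current hard limit. The soft limit is
 * never lowered. On success returns 0 and stores the resulting soft limit
 * in *result; on failure returns -1.
 */
int raise_nofile_limit(rlim_t wanted, rlim_t *result)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;

    /* An unprivileged process cannot exceed the hard limit. */
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;

    if (wanted > rl.rlim_cur) {
        rl.rlim_cur = wanted;
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
            return -1;
    }

    *result = rl.rlim_cur;
    return 0;
}
```

Calling raise_nofile_limit(32767, &now) early in main() would mirror the `ulimit -n 32767` above. The system-wide /proc/sys/fs/file-max setting is separate and still has to be changed as root.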
Then I tried to run the scheduler. It already had a load of
messages to deliver (around 1 million files), so it started forking
new children, when suddenly... SEGV. 100% reproducible.
Here's the gdb output:
(gdb) bt
#0 0x00000001 in ?? ()
#1 0x08082f20 in procselhost ()
#2 0xbfdec7f0 in ?? ()
#3 0xbfdec770 in ?? ()
#4 0x00000000 in ?? ()
#5 0xbfdec768 in ?? ()
#6 0x0808ad20 in ?? ()
#7 0xbfdec778 in ?? ()
#8 0x0806b54f in sfprintf (f=0x1, form=0x1 <Address 0x1 out of bounds>) at sfprintf.c:27
#9 0xffffffff in ?? ()
#10 0xffffffff in ?? ()
#11 0xffffffff in ?? ()
#12 0xffffffff in ?? ()
#13 0xffffffff in ?? ()
#14 0xffffffff in ?? ()
#15 0xffffffff in ?? ()
#16 0xffffffff in ?? ()
#17 0xffffffff in ?? ()
#18 0xffffffff in ?? ()
#19 0xffffffff in ?? ()
#20 0xffffffff in ?? ()
#21 0xffffffff in ?? ()
#22 0xffffffff in ?? ()
#23 0x0000ffff in ?? ()
#24 0x00000007 in ?? ()
#25 0x00000006 in ?? ()
#26 0x0804be20 in sig_exit () at scheduler.c:1108
(gdb) f 0
#0 0x00000001 in ?? ()
(gdb) l
1108 die(0, "signal");
1109 mustexit = 1;
1110 }
1111
1112 static RETSIGTYPE sig_quit(sig)
1113 int sig;
1114 {
1115 slow_shutdown = 1;
1116 freeze = 1;
1117 }
(gdb) f 8
#8 0x0806b54f in sfprintf (f=0x1, form=0x1 <Address 0x1 out of bounds>) at sfprintf.c:27
27 rv = sfvprintf(f,form,args);
(gdb) l
22 reg char* form;
23 va_start(args);
24 f = va_arg(args,Sfio_t*);
25 form = va_arg(args,char*);
26 #endif
27 rv = sfvprintf(f,form,args);
28
29 va_end(args);
30 return rv;
31 }
(gdb) f 26
#26 0x0804be20 in sig_exit () at scheduler.c:1108
1108 die(0, "signal");
(gdb) l
1103 if (querysocket6 >= 0) { /* give up mailq socket asap */
1104 close(querysocket6);
1105 querysocket6 = -1;
1106 }
1107 if (canexit)
1108 die(0, "signal");
1109 mustexit = 1;
1110 }
1111
1112 static RETSIGTYPE sig_quit(sig)
# /usr/local/zmailer/bin/scheduler -V
ZMailer scheduler (2.99.57.pre4 #1: Thu Aug 3 22:36:38 CEST 2006)
root@localhost:/root/qnex/zmailer/zmailer/scheduler
Copyright 1992 Rayan S. Zachariassen
Copyright 1992-2004 Matti Aarnio
Configured with command: 'CC='gcc' CFLAGS='-g -O2' ./configure
'--with-openssl' '--with-ta-mmap' '--prefix=/usr/local/zmailer''
# file /usr/local/zmailer/bin/scheduler
/usr/local/zmailer/bin/scheduler: ELF 32-bit LSB executable, Intel
80386, version 1 (SYSV), for GNU/Linux 2.2.0, dynamically linked (uses
shared libs), not stripped
To me it seems like there could be a bug within sfio (the Safe/Fast I/O
library) which makes it unable to handle too many simultaneously open files...
Any idea how to fix it? :)
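One thing that may be worth ruling out alongside sfio, purely as an assumption since nothing in the trace above confirms it: if any code path uses select(), a descriptor numbered FD_SETSIZE or higher (typically 1024 on Linux with glibc) must not be passed to FD_SET(), because that writes outside the fd_set and the behaviour is undefined. A stack full of 0xffffffff words would be consistent with that kind of overwrite, but it does not prove it. A minimal sketch of a guard, with hypothetical helper names that are not taken from ZMailer:

```c
#include <sys/select.h>

/* Hypothetical helper: 1 if fd may safely be used with FD_SET()/FD_ISSET(). */
int fd_fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/*
 * Hypothetical guarded wrapper around FD_SET(): refuses descriptors that
 * would index past the end of the fd_set instead of corrupting memory.
 * Returns 0 on success, -1 if fd is out of range.
 */
int safe_fd_set(int fd, fd_set *set)
{
    if (!fd_fits_in_fd_set(fd))
        return -1;
    FD_SET(fd, set);
    return 0;
}
```

With around 3000 children, each holding at least one pipe or socket in the parent, descriptor numbers above 1023 would be expected, so a check like this (or a switch to poll()) is a cheap way to test the hypothesis.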
Regards,
Dawid