Consistent Solaris crashes

zellster

Well-Known Member
#1
Hello,

I am running lsws 1.5.7 Standard Edition on a Solaris 5.8 SPARC box (uname info: SunOS mach_name 5.8 Generic_108528-23 sun4u sparc SUNW,UltraAX-i2).

I am consistently seeing the following problem in the lsws process:

Received signal #14, SIGALRM, in poll() [caught]
poll(0x00229098, 5, 1000) Err#4 EINTR
setcontext(0xFFBEF728)
time() = 1094165817
poll(0x00229098, 5, 1000) = 1
accept(9, 0x001F0AFC, 0xFFBEF934, 1) = 12
accept(9, 0x001F0B1C, 0xFFBEF934, 1) Err#11 EAGAIN
fcntl(12, F_SETFD, 0x00000001) = 0
fstat64(12, 0xFFBEF750) = 0
getsockopt(12, 65535, 8192, 0xFFBEF850, 0xFFBEF84C, 2198285) = 0
setsockopt(12, 65535, 8192, 0xFFBEF850, 4, 2198285) = 0
fcntl(12, F_SETFL, 0x00000080) = 0
poll(0x00229098, 6, 1000) = 1
read(12, " G E T / l i b / f p b".., 2044) = 341
Incurred fault #5, FLTACCESS %pc = 0x00059FE4
siginfo: SIGBUS BUS_ADRALN addr=0x00218E03
Received signal #10, SIGBUS [default]
siginfo: SIGBUS BUS_ADRALN addr=0x00218E03
*** process killed ***

Any ideas what this could be? Any wonkish code performing word comparisons on unaligned data?
 

mistwang

LiteSpeed Staff
#2
Thank you for the bug report.

Yes, it looks like an alignment problem. SPARC is picky about alignment. ;-)
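
For illustration only, here is a minimal C sketch of the usual way to read or compare a word at an address that may not be aligned. The helper names `load_u32`, `store_u32`, and `words_equal` are hypothetical and are not taken from the lsws source; the actual fix in `newKeyValueBuf` may look quite different.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, not taken from the lsws source. */

/* Read a 32-bit value from any byte address. memcpy has no alignment
   requirement, so this does not fault even when p is odd. */
static uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 32-bit value to any byte address. */
static void store_u32(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Compare the 4-byte words at two possibly unaligned addresses. */
static int words_equal(const void *a, const void *b)
{
    return load_u32(a) == load_u32(b);
}
```

By contrast, a direct cast such as `*(const uint32_t *)p` with a misaligned `p` is undefined behavior in C, and on SPARC it typically ends in SIGBUS with BUS_ADRALN, as in the trace above.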

Could you please send us the core file or a backtrace of the call stack? We have not been able to reproduce the crash in our lab.

The core file should be under the /tmp/lshttpd/ directory.
You may have to use the "coreadm" command to enable core dumps for a setuid process, or simply start the server process as the non-privileged user specified during installation.

Thank you very much!
George Wang
 

zellster

Well-Known Member
#3
Hello,

Here's a stack trace. I will attach the core file in the next post.

# dbx /usr/local/lsws/bin/lshttpd.1.5.7 core
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.1' in your .dbxrc
Reading lshttpd.1.5.7
core file header read successfully
Reading ld.so.1
Reading libsocket.so.1
Reading libnsl.so.1
Reading librt.so.1
Reading libm.so.1
Reading libc.so.1
Reading libresolv.so.2
Reading libdl.so.1
Reading libmp.so.2
Reading libaio.so.1
Reading libc_psr.so.1
Reading nss_files.so.1
program terminated by signal BUS (invalid address alignment)
Symbol *0x83f0d0
dbx: duplicate type definition (0,1), assuming (int {assumed}), sclass 28: /usr/local/lsws/bin/lshttpd.1.5.7:/home/gwang/crossrel/release/httpd/httpd/main.cpp stab #30 int:t(0,1)=r(0,1);0020000000000;0017777777777;
(dbx) where
=>[1] newKeyValueBuf__7HttpReqRi(0x2164c4, 0x2165b8, 0x7e7e7e00, 0x76696100, 0x76696000, 0x74000000), at 0x59fe4
[2] newUnknownHeader__7HttpReq(0x2164c4, 0x216bd9, 0x59c00, 0xa, 0x216bd2, 0xa), at 0x5e9f8
[3] processHeaderLines__7HttpReq(0x2164c4, 0x216bd2, 0x216c8f, 0x3a, 0x21cd4, 0x4e4e0), at 0x59064
[4] processRequestLine__7HttpReq(0x2164c4, 0x0, 0x216ba4, 0x216c8f, 0x0, 0x0), at 0x58bc4
[5] processHeader__7HttpReq(0x2164c4, 0x2164c4, 0x7fc, 0x0, 0x0, 0x0), at 0x579cc
[6] readToHeaderBuf__14HttpConnection(0x21648c, 0x1, 0x1, 0x0, 0x0, 0x0), at 0x60598
[7] onReadEx__14HttpConnection(0x21648c, 0x61c14, 0x0, 0x0, 0x0, 0x0), at 0x61d44
[8] onRead__10HttpIOLinkP10HttpIOLink(0x21648c, 0x4df24, 0x0, 0x0, 0x0, 0x0), at 0x4df50
[9] handleEvents__10HttpIOLinks(0x21648c, 0x1, 0x4cecc, 0x0, 0x0, 0x0), at 0x4d02c
[10] processAllEvents__13PollfdReactor(0x205910, 0x205910, 0x3e8, 0x0, 0x20c44, 0x918f4), at 0x91d9c
[11] waitAndProcessEvents__6Polleri(0x205908, 0x3e8, 0x918b8, 0x0, 0x0, 0x0), at 0x9192c
[12] run__15EventDispatcher(0x227444, 0x227444, 0x0, 0x0, 0x0, 0x0), at 0x46644
[13] start__14HttpServerImpl(0x22742c, 0x178400, 0x179000, 0xffffffff, 0xfffffff8, 0x227d4d), at 0x13d4c
[14] start__10HttpServer(0x1eebe4, 0x1, 0xffbefd04, 0x0, 0x0, 0xff21f854), at 0x18e5c
[15] main__10HttpServeriPPc(0x1eebe4, 0x1, 0xffbefd04, 0x300, 0x225b4, 0xff19bc08), at 0x19270
[16] main(0x1, 0xffbefd04, 0xffbefd0c, 0x1eebc4, 0x0, 0x0), at 0x136cc
(dbx) quit
 
#7
Hello Experts,

On a Solaris 5.10 SPARC machine, I am getting a core dump when executing a PHP file that uses the PHPExcel libraries. The following is a snippet of the truss output for the execution:

open("/usr/local/apache/htdocs/SMART/Classes/PHPExcel/ReferenceHelper.php", O_RDONLY) = 3
fstat(3, 0x100F15590) = 0
fstat(3, 0x100F15590) = 0
fstat(3, 0x100F15590) = 0
mmap(0x00000000, 39952, PROT_READ, MAP_SHARED, 3, 0) = 0xFFFFFFFF7D400000
brk(0x101531280) = 0
brk(0x101571280) = 0
munmap(0xFFFFFFFF7D400000, 39952) = 0
close(3) = 0
access("/usr/local/apache/htdocs/SMART/Classes/PHPExcel/WorksheetIterator.php", F_OK) = 0
access("/usr/local/apache/htdocs/SMART/Classes/PHPExcel/WorksheetIterator.php", R_OK) = 0
time() = 1436799393
lstat("/usr/local/apache/htdocs/SMART/Classes/PHPExcel/WorksheetIterator.php", 0xFFFFFFFF7FFFB180) = 0
open("/usr/local/apache/htdocs/SMART/Classes/PHPExcel/WorksheetIterator.php", O_RDONLY) = 3
fstat(3, 0x100F15590) = 0
fstat(3, 0x100F15590) = 0
fstat(3, 0x100F15590) = 0
mmap(0x00000000, 2624, PROT_READ, MAP_SHARED, 3, 0) = 0xFFFFFFFF7D600000
munmap(0xFFFFFFFF7D600000, 2624) = 0
close(3) = 0
access("/usr/local/apache/htdocs/SMART/Classes/PHPExcel/Cell.php", F_OK) = 0
access("/usr/local/apache/htdocs/SMART/Classes/PHPExcel/Cell.php", R_OK) = 0
time() = 1436799393
lstat("/usr/local/apache/htdocs/SMART/Classes/PHPExcel/Cell.php", 0xFFFFFFFF7FFFB150) = 0
open("/usr/local/apache/htdocs/SMART/Classes/PHPExcel/Cell.php", O_RDONLY) = 3
fstat(3, 0x100F15590) = 0
fstat(3, 0x100F15590) = 0
fstat(3, 0x100F15590) = 0
mmap(0x00000000, 27136, PROT_READ, MAP_SHARED, 3, 0) = 0xFFFFFFFF7D400000
munmap(0xFFFFFFFF7D400000, 27136) = 0
close(3) = 0
access("/usr/local/apache/htdocs/SMART/Classes/PHPExcel/Cell/DataType.php", F_OK) = 0
access("/usr/local/apache/htdocs/SMART/Classes/PHPExcel/Cell/DataType.php", R_OK) = 0
time() = 1436799393
lstat("/usr/local/apache/htdocs/SMART/Classes/PHPExcel/Cell/DataType.php", 0xFFFFFFFF7FFFB150) = 0
lstat("/usr/local/apache/htdocs/SMART/Classes/PHPExcel/Cell", 0xFFFFFFFF7FFFAF50) = 0
open("/usr/local/apache/htdocs/SMART/Classes/PHPExcel/Cell/DataType.php", O_RDONLY) = 3
fstat(3, 0x100F15590) = 0
fstat(3, 0x100F15590) = 0
fstat(3, 0x100F15590) = 0
mmap(0x00000000, 3538, PROT_READ, MAP_SHARED, 3, 0) = 0xFFFFFFFF7D600000
munmap(0xFFFFFFFF7D600000, 3538) = 0
close(3) = 0
write(1, " A l i g n", 5) = 5
write(1, " 2", 1) = 1
Incurred fault #5, FLTACCESS %pc = 0x1000EAA0C
siginfo: SIGBUS BUS_ADRALN addr=0x10123274B
Received signal #10, SIGBUS [default]
siginfo: SIGBUS BUS_ADRALN addr=0x10123274B
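
As a quick sanity check on the trace above, the faulting address 0x10123274B is odd, so any load or store wider than one byte at that address is misaligned on SPARC. A minimal C sketch of that check follows; the helper name `is_aligned` is hypothetical and is only meant to illustrate the arithmetic.

```c
#include <stdint.h>

/* Hypothetical helper: returns nonzero if addr is a multiple of size.
   size must be a power of two (2, 4, 8, ...). */
static int is_aligned(uint64_t addr, uint64_t size)
{
    return (addr & (size - 1)) == 0;
}
```

For example, `is_aligned(0x10123274BULL, 2)` is 0, which matches the BUS_ADRALN code reported by truss.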



The machine information is: SunOS orssdp7a 5.10 Generic_150400-17 sun4u sparc SUNW,Netra-T12

However, when I run the same code on another similar machine (with different hardware, though), it runs successfully. The other machine is: SunOS bhasdp6n 5.10 Generic_150400-11 sun4u sparc SUNW,Sun-Fire

Kindly suggest what the cause of this could be.

I have reinstalled all the relevant Apache/PHP packages that my code uses, several times, and have also compared them with those on the machine that works fine.

Finally, to me it seems the issue is related to the machine itself. The following is the pstack output from the generated core file:

VIBHA SINGAL
core 'core' of 29992: php test_excel.php
00000001000eaa0c compare_opcodes (101232770, 0, ffffffff7fffdcd0, ffffffff7fffc9f0, 101232770, 101232785) + 85c
00000001000ec340 auto_possessify (1012326e0, 0, ffffffff7fffdcd0, ffffffff7fffddc4, 0, 0) + 3b0
00000001000fa3ec php_pcre_compile2 (101581a38, 1, 0, ffffffff7fffe030, ffffffff7fffe02c, 101232250) + 1944
00000001000f8a90 php_pcre_compile (101581a38, 1, ffffffff7fffe030, ffffffff7fffe02c, 101232250, 101010101010101) + 3c
000000010012ddd4 pcre_get_compiled_regex_cache (101581aa0, 54, ffffffff7fffe168, ffffffff7fffe15c, ffffffff7fffe160, ffffffff7fffe158) + 86c
000000010012e6f4 php_do_pcre_match (3, 10157c1e0, 0, 0, 1, 0) + dc
000000010012f838 zif_preg_match (3, 10157c1e0, 0, 0, 1, 100ee1d00) + 48
000000010072298c zend_do_fcall_common_helper_SPEC (100ee1ee0, 100f20070, b, c0c1e31caa7e607f, 100ee1ee8, b) + 578
000000010072b220 ZEND_DO_FCALL_SPEC_CONST_HANDLER (100ee1ee0, 8, ffffffffffffffff, fffffffffffffff8, 0, 100ee1f00) + 1b0
0000000100720ce8 execute_ex (100edf2b0, 8, ffffffffffffffff, 1, 0, 100edf210) + a20
00000001007217a0 zend_execute (100f12700, 100f126c0, 2e, ffffffff7fffe89c, 4, 0) + a00
00000001006d2818 zend_execute_scripts (8, 0, 3, 0, ffffffff7ffff620, 0) + 230
000000010060c544 php_execute_script (ffffffff7ffff620, 74, 2d, 746573745f657863, 38, 101010101010101) + 3bc
000000010084bba8 do_cli (2, 100ed56d0, 2, 82, ffffffff7d800200, 0) + e50
000000010084d378 main (2, ffffffff7ffff948, ffffffff7ffff960, 100ed2808, 100000000, ffffffff7d800200) + 7c0
00000001000608c4 _start (0, 0, 0, 0, 0, 0) + 7c


I need your help urgently; kindly suggest.

BR
vG
 