This is the tale of a very interesting problem: a segmentation fault on 64-bit RHEL 5.3 when invoking tnsping. I initially thought the box the client was installed on (a development virtual machine) was seriously ill, but it turned out to be something else altogether.
Here is the initial problem. One of the developers contacted me saying that he couldn’t connect to one of the databases. Sure enough, sqlplus wouldn’t connect:
[oracle@dev-vm-001 tns]$ sqlplus a/b@devone
SQL*Plus: Release 10.2.0.1.0 - Production on Thu May 20 15:03:47 2010
Copyright (c) 1982, 2005, Oracle. All rights reserved.
ERROR:
ORA-12154: TNS:could not resolve the connect identifier specified
Enter user-name:
I thought this was simple enough: they probably hadn't defined the database in the tnsnames.ora file. Checking the file revealed that I was wrong!
[oracle@dev-vm-001 tns]$ cat /u01/app/oracle/product/10.2.0/client_1/network/admin/tnsnames.ora
DEVONE =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = dev1db)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = DEVONE)
    )
  )
So then maybe there is no connectivity to dev1db? tnsping would tell me.
[oracle@dev-vm-001 tns]$ tnsping devone
TNS Ping Utility for Linux: Version 10.2.0.1.0 - Production on 20-MAY-2010 14:59:10
Copyright (c) 1997, 2005, Oracle. All rights reserved.
Segmentation fault
Ouch! A segfault??? I then tried a few things that didn't work, including a check of the sqlnet.ora file. No luck, it continued to throw segmentation faults. sqlnet.ora contained the following, harmless enough in its own right (and no, SQLNET.AUTHENTICATION_SERVICES wasn't the culprit either!). To reproduce, I copied the files into /home/oracle/tns and set TNS_ADMIN to that directory.
[oracle@dev-vm-001 tns]$ cat $ORACLE_HOME/network/admin/sqlnet.ora
# SQLNET.ORA Network Configuration File: C:\oracle\ora92\network\admin\sqlnet.ora
# Generated by Oracle configuration tools.
SQLNET.AUTHENTICATION_SERVICES= (NTS)
NAMES.DIRECTORY_PATH= (TNSNAMES, ONAMES, HOSTNAME)
OK, so I got out the heavy artillery: strace, ltrace and ldd. strace first:
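The reproduction setup (copying the config files and pointing TNS_ADMIN at the copy) could be sketched roughly as follows. This is an assumption about how such a copy might be done, not the exact commands used on the box; the function name stage_tns_admin is made up for illustration, and it only looks at sqlnet.ora and tnsnames.ora.

```shell
# Hypothetical helper; the name stage_tns_admin is illustrative only.
# Copies sqlnet.ora and tnsnames.ora (if present) from a source directory
# into a scratch directory and exports TNS_ADMIN so the Oracle client
# tools read the copies instead of the originals.
stage_tns_admin() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for f in sqlnet.ora tnsnames.ora; do
        if [ -f "$src/$f" ]; then
            cp "$src/$f" "$dest/" || return 1
        fi
    done
    TNS_ADMIN=$dest
    export TNS_ADMIN
}

# Example call, using the paths from this post:
#   stage_tns_admin "$ORACLE_HOME/network/admin" /home/oracle/tns
```

Working on copies keeps the original client configuration untouched while files are edited or removed one at a time.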
[oracle@dev-vm-001 tns]$ strace tnsping devone
execve("/u01/app/oracle/product/10.2.0/client_1/bin/tnsping", ["tnsping", "devone"], [/* 22 vars */]) = 0
brk(0) = 0xafe000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b7f9653f000
[skipped for clarity]
access("/home/oracle/tns/sqlnet.ora", F_OK) = 0
open("/home/oracle/tns/sqlnet.ora", O_RDONLY) = 3
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=267, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b7f97e8f000
read(3, "# SQLNET.ORA Network Configurati"..., 4096) = 267
read(3, "", 4096) = 0
close(3) = 0
munmap(0x2b7f97e8f000, 4096) = 0
open("/u01/app/oracle/product/10.2.0/client_1/network/names/.sdns.ora", O_RDONLY) = -1 ENOENT (No such file or directory)
getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=1024}) = 0
brk(0xb40000) = 0xb40000
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Not too enlightening at first. So let’s try ltrace:
[oracle@dev-vm-001 tns]$ ltrace tnsping devone
__libc_start_main(0x400c28, 2, 0x7fff0a8cb298, 0x4017d0, 0x4017c0 <unfinished ...>
malloc(256) = 0xb900010
_intel_fast_memset(0x7fff0a8c9140, 0, 2000, 0xb900110, 4) = 0x7fff0a8c9140
nlstdgg(0x7fff0a8c9918, 0x7fff0a8c9140, 0xb900010, 256, 0x7fff0a8c9910
TNS Ping Utility for Linux: Version 10.2.0.1.0 - Production on 20-MAY-2010 14:59:38
Copyright (c) 1997, 2005, Oracle. All rights reserved.
) = 0
nlepeget(0x2b45a1654460, 0, 0xb900b10, 0, 0) = 0xb9010c0
nlemfireg(0xb9010c0, 0xb901120, 4, 0x40195c, 7) = 0
_setjmp(0x7fff0a8cafb0, 0, 1, 1, 1) = 0
nlepeget(0x2b45a1654460, 0, 0x7fff0a8cafa0, 0xb912b50, 1) = 0xb9010c0
nlfiini(0x2b45a1654460, 0xb9010c0, 0x7fff0a8cb118, 0xb9132d0, 1) = 0
_intel_fast_memcpy(0x7fff0a8cab50, 0x7fff0a8ccbb5, 7, 0x7fff0a8ccbbb, 0x2b45a085df6e) = 0x7fff0a8cab50
nnfsn2awanm(0x2b45a1654460, 0x7fff0a8cab50, 255, 0x7fff0a8cb0e0, 0x7fff0a8c9a50 <unfinished ...>
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++
Searching Metalink and the web turned up no hints. Someone suggested that tnsping wasn't linked correctly, so I checked $ORACLE_HOME/install/make.log:
rm -f tnsping
gcc -o tnsping -L/u01/app/oracle/product/10.2.0/client_1/network/lib/ ...[a lot of -L and -I flags skipped] -L/u01/app/oracle/product/10.2.0/client_1/lib
mv -f /u01/app/oracle/product/10.2.0/client_1/bin/tnsping /u01/app/oracle/product/10.2.0/client_1/bin/tnsping0
mv tnsping /u01/app/oracle/product/10.2.0/client_1/bin/tnsping
/bin/chmod 751 /u01/app/oracle/product/10.2.0/client_1/bin/tnsping
No error either. This was getting slightly frustrating. Maybe a library was missing?
[oracle@dev-vm-001 tns]$ ldd `which tnsping`
libclntsh.so.10.1 => /u01/app/oracle/product/10.2.0/client_1/lib/libclntsh.so.10.1 (0x00002b0f7d270000)
libnnz10.so => /u01/app/oracle/product/10.2.0/client_1/lib/libnnz10.so (0x00002b0f7e6ea000)
libdl.so.2 => /lib64/libdl.so.2 (0x000000396dc00000)
libm.so.6 => /lib64/libm.so.6 (0x000000396e400000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x000000396e000000)
libnsl.so.1 => /lib64/libnsl.so.1 (0x0000003970000000)
libc.so.6 => /lib64/libc.so.6 (0x000000396d800000)
/lib64/ld-linux-x86-64.so.2 (0x000000396d400000)
Checked and they all existed. I then left the problem for a little while to think :)
The Solution
It turned out that moving sqlnet.ora out of the TNS_ADMIN directory solved the problem. An hour later I had an idea: the file must have been copied from a Windows system, as it said C:\ … in the header comment. So what if Windows line endings (CRLF) kill tnsping?
[oracle@dev-vm-001 tns]$ cat -A sqlnet.ora
# SQLNET.ORA Network Configuration File: C:\oracle\ora92\network\admin\sqlnet.ora^M$
# Generated by Oracle configuration tools.^M$
^M$
NAMES.DEFAULT_DOMAIN = markit.partners^M$
^M$
#SQLNET.AUTHENTICATION_SERVICES= (NTS)^M$
^M$
NAMES.DIRECTORY_PATH= (TNSNAMES, ONAMES, HOSTNAME)^M$
^M$
Spot the evil ^M? Each one is a carriage return, so it surely was a Windows file. So let's convert it to something good:
[oracle@dev-vm-001 tns]$ dos2unix sqlnet.ora
dos2unix: converting file sqlnet.ora to UNIX format ...
[oracle@dev-vm-001 tns]$ cat -A sqlnet.ora
# SQLNET.ORA Network Configuration File: C:\oracle\ora92\network\admin\sqlnet.ora$
# Generated by Oracle configuration tools.$
$
NAMES.DEFAULT_DOMAIN = markit.partners$
$
#SQLNET.AUTHENTICATION_SERVICES= (NTS)$
$
NAMES.DIRECTORY_PATH= (TNSNAMES, ONAMES, HOSTNAME)$
$
The evilness was now cured. Now for the big moment, and voilà!
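Where dos2unix is not installed, the same check and conversion can be approximated with standard tools. This is a minimal sketch, not what was run on the box above; the function names has_cr and strip_cr are made up for illustration. Note that it deletes every carriage return in the file, which is slightly broader than converting CRLF pairs.

```shell
# Hypothetical helpers; the names has_cr and strip_cr are illustrative only.

# Succeeds (exit status 0) if the file contains at least one carriage return.
has_cr() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Removes every carriage return from the file, roughly what dos2unix does.
# The result goes to a temporary file first, so a failed tr leaves the
# original untouched.
strip_cr() {
    tmp=$(mktemp) || return 1
    if tr -d '\r' < "$1" > "$tmp"; then
        cat "$tmp" > "$1"    # overwrite in place, keeping owner and permissions
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example call:
#   if has_cr sqlnet.ora; then strip_cr sqlnet.ora; fi
```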
[oracle@dev-vm-001 tns]$ tnsping devone
TNS Ping Utility for Linux: Version 10.2.0.1.0 - Production on 20-MAY-2010 14:46:07
Copyright (c) 1997, 2005, Oracle. All rights reserved.
Used parameter files:
/home/oracle/tns/sqlnet.ora
Used TNSNAMES adapter to resolve the alias
)SERVICE_NAME = DEVONE)TCP)(HOST = dev1db)(PORT = 1521))
OK (0 msec)
Wow. Wonder if that’s fixed in patchsets > 10.2.0.1? I couldn’t simply patch the client as that would involve regression testing on the developers’ side.
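Since patching the client was not an option, one possible safeguard is to check the configuration directory for files with Windows line endings before they cause trouble. The following is only a sketch under that assumption; the function name list_crlf_files is hypothetical, and it looks at *.ora files in a single directory (TNS_ADMIN by default).

```shell
# Hypothetical check; the name list_crlf_files is illustrative only.
# Prints the name of every *.ora file in the given directory (default:
# $TNS_ADMIN) that contains a carriage return. Returns 1 if at least one
# such file was found, 0 otherwise.
list_crlf_files() {
    dir=${1:-$TNS_ADMIN}
    cr=$(printf '\r')
    found=0
    for f in "$dir"/*.ora; do
        [ -f "$f" ] || continue      # skip the unexpanded glob if nothing matches
        if grep -q "$cr" "$f"; then
            printf '%s\n' "$f"
            found=1
        fi
    done
    return "$found"
}

# Example call:
#   list_crlf_files /home/oracle/tns || echo "convert the files listed above"
```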