There are times when your software is
failing but it doesn't output any error messages. Maybe you are getting a
500 error when you load a cgi script but the log is not telling you
why. Strace can often help in situations like this.
Strace does not trace the internals of your application, it only knows about syscalls, or calls into the kernel. This can be enough in many cases to show you when your application might be trying to look for a file that is not there, or when it cannot allocate memory or some other error.
Generally I call strace with option -f so it will follow child processes, and then the name of the command to trace. To save the output to a file, use -o.
The first column shows the process id, and then you see the
syscall and its result. Here it opens /etc/ld.so.cache in readonly mode,
and the result is file descriptor 3. After this it calls fstat64 and
old_mmap on file descriptor 3. Then it closes file descriptor 3. Once it
has been closed, it can be reused as you see in the next call where it
opens /lib/tls/librt.so.1.
In cases where you want to trace a process that is already running, pass the -p option to strace. I will run strace on Apache.
Here I have the process IDs for apache running on my box. Which
one to trace? Pick one, then keep reloading your web site until the
process you are tracing is the one that serves up your page. If the
process does not belong to you, you will probably need to be root in
order to trace it.
I ran strace in the background by using & so I could view the
output of strace.out with tail -f. This way I could know when my process
was the one that was doing the work. Then I reloaded my site about 5
times until it cycled around to process 6457.
In the beginning of the output shown, it may look like there is an error, because it keeps showing "No such file or directory" while trying to load libperl.so. But you can see that it is looking for the library in multiple paths and eventually it does find it in the directory /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/. In this case, the page was loaded without any errors. I will do it again, but this time I will try to load a page that does not exist, in this case http://10.2.4.15/test.
It tried to find /usr/local/www/test but could not so it failed in the end.
Strace does not trace the internals of your application, it only knows about syscalls, or calls into the kernel. This can be enough in many cases to show you when your application might be trying to look for a file that is not there, or when it cannot allocate memory or some other error.
Generally I call strace with option -f so it will follow child processes, and then the name of the command to trace. To save the output to a file, use -o.
Code:
[josephw@myhost ~]$ strace -f -o strace.out ls / bin dev home lib media mnt proc sbin space sys usr boot etc initrd lost+found misc opt root selinux srv tmp var [josephw@myhost ~]$ less strace.out 6368 execve("/bin/ls", ["ls", "/"], [/* 17 vars */]) = 0 6368 uname({sys="Linux", node="dss03.spry.com", ...}) = 0 6368 brk(0) = 0x8cc5000 6368 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) 6368 open("/etc/ld.so.cache", O_RDONLY) = 3 6368 fstat64(3, {st_mode=S_IFREG|0644, st_size=36486, ...}) = 0 6368 old_mmap(NULL, 36486, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f8f000 6368 close(3) = 0 6368 open("/lib/tls/librt.so.1", O_RDONLY) = 3 ....output trimmed....
In cases where you want to trace a process that is already running, pass the -p option to strace. I will run strace on Apache.
Code:
[josephw@myhost ~]$ ps aux |grep http apache 6457 0.0 0.5 18764 5456 ? S Oct31 0:00 /usr/sbin/httpd apache 6498 0.0 0.5 18764 5464 ? S Oct31 0:00 /usr/sbin/httpd apache 6500 0.0 0.5 18768 6000 ? S Oct31 0:00 /usr/sbin/httpd apache 6505 0.0 0.5 18768 6080 ? S Oct31 0:00 /usr/sbin/httpd apache 6507 0.0 0.5 18764 5464 ? S Oct31 0:00 /usr/sbin/httpd
Code:
[root@myhost ~]# strace -f -o strace.out -p 6457 & [root@myhost ~]# tail -f strace.out ....output trimmed.... 21361 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) 21361 open("/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/tls/i686/sse2/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory) 21361 stat64("/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/tls/i686/sse2", 0xbfeca780) = -1 ENOENT (No su ch file or directory) 21361 open("/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/tls/i686/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory) 21361 stat64("/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/tls/i686", 0xbfeca780) = -1 ENOENT (No such fi le or directory) 21361 open("/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/tls/sse2/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory) 21361 stat64("/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/tls/sse2", 0xbfeca780) = -1 ENOENT (No such fi le or directory) 21361 open("/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/tls/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory) 21361 stat64("/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/tls", 0xbfeca780) = -1 ENOENT (No such file or directory) 21361 open("/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/i686/sse2/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory) 21361 stat64("/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/i686/sse2", 0xbfeca780) = -1 ENOENT (No such f ile or directory) 21361 open("/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/i686/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory) 21361 stat64("/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/i686", 0xbfeca780) = -1 ENOENT (No such file o r directory) 21361 open("/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/sse2/libperl.so", O_RDONLY) = -1 ENOENT (No such file or directory) 21361 stat64("/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/sse2", 0xbfeca780) = -1 ENOENT (No such file o r directory) 21361 open("/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/libperl.so", O_RDONLY) = 3 ....output trimmed....
In the beginning of the output shown, it may look like there is an error, because it keeps showing "No such file or directory" while trying to load libperl.so. But you can see that it is looking for the library in multiple paths and eventually it does find it in the directory /usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE/. In this case, the page was loaded without any errors. I will do it again, but this time I will try to load a page that does not exist, in this case http://10.2.4.15/test.
Code:
....output trimmed.... 6457 lstat64("/usr/local/www/test", 0xbffb3104) = -1 ENOENT (No such file or directory) ....output trimmed.... 6457 read(18, 0xbffb3270, 512) = -1 ECONNRESET (Connection reset by peer) 6457 close(18) = 0 6457 read(5, 0xbffb347f, 1) = -1 EAGAIN (Resource temporarily unavailable) 6457 semop(7307267, 0x1c9740, 1 <unfinished ...>