/************************************************************************/
/* Document     : Short note on how to trace processes in UNIX.         */ 
/* Doc. Version : 4                                                     */
/* File         : tracing.txt                                           */
/* Purpose      : Some examples on how to trace processes in UNIX.      */
/*                For the DBA working with databases on UNIX.           */
/* Date         : 14/08/2009                                            */
/* Compiled by  : Albert van der Sel                                    */
/************************************************************************/


============================================================================
1. First some info before you trace:
============================================================================ 


When you study your trace files, you may come across a number of error messages or error codes.
The error codes we mean here are the codes that are also visible in the file "errno.h".
This is a header file in the standard library of the C programming language.

Those are a subset of the codes that a program might get when it requests a service 
from the system (like for example, "open file").

That is certainly not everything you might run into regarding errors and their codes,
but it constitutes an important base of what you can encounter in traces.

Suppose you find something like this in a trace:

  vnop_lookup(dvp = F100010034228BF8, flag = 0002) = 0002, *vpp = 0000
  return from statx. error ENOENT [13 usec]

What can ENOENT mean? If you don't find more explanatory text close to this line, you can look it up
in the tables below: it means "No such file or directory".

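Actually, I produced 2 lists, one from Linux and one from AIX, to show how similar they are.
In the lists below, codes 1 through 34 are identical; above that the numbering diverges (EDEADLK, for example,
is 35 on Linux and 45 on AIX), so there is no guarantee that the codes are *exactly* the same on all systems.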

By the way, if you search for the "errno.h" file (or a similarly named file) on your own system
and take a look at its contents, you can create the list yourself for your particular unix/linux system.
You can (likely) find that file in "/usr/include/sys".
But for easy reference, we list the most important errnos for 2 representative unixes.
(Yes.. one listing would have been quite sufficient).
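
As a small sketch of the lookup described above, the shell function below prints the "#define" line for one
errno name from a header file. The function name "errno_define" is made up for this note, and the header
location is an assumption: on many Linux systems the numeric defines sit in files such as
"/usr/include/asm-generic/errno-base.h" rather than directly in "errno.h".

```shell
# errno_define NAME HEADER
# Hypothetical helper: print the "#define NAME ..." line found in HEADER.
# Returns non-zero if NAME is not defined in that file.
errno_define() {
    awk -v name="$1" '
        $1 == "#define" && $2 == name { print; found = 1 }
        END { exit !found }
    ' "$2"
}
```

You might call it like "errno_define ENOENT /usr/include/sys/errno.h" (adjust the path to your system).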


1.1 Errcodes Linux (generic) from errno.h :
===========================================


#define EPERM            1      /* Operation not permitted */
#define ENOENT           2      /* No such file or directory */
#define ESRCH            3      /* No such process */
#define EINTR            4      /* Interrupted system call */
#define EIO              5      /* I/O error */
#define ENXIO            6      /* No such device or address */
#define E2BIG            7      /* Arg list too long */
#define ENOEXEC          8      /* Exec format error */
#define EBADF            9      /* Bad file number */
#define ECHILD          10      /* No child processes */
#define EAGAIN          11      /* Try again */
#define ENOMEM          12      /* Out of memory */
#define EACCES          13      /* Permission denied */
#define EFAULT          14      /* Bad address */
#define ENOTBLK         15      /* Block device required */
#define EBUSY           16      /* Device or resource busy */
#define EEXIST          17      /* File exists */
#define EXDEV           18      /* Cross-device link */
#define ENODEV          19      /* No such device */
#define ENOTDIR         20      /* Not a directory */
#define EISDIR          21      /* Is a directory */
#define EINVAL          22      /* Invalid argument */
#define ENFILE          23      /* File table overflow */
#define EMFILE          24      /* Too many open files */
#define ENOTTY          25      /* Not a typewriter */
#define ETXTBSY         26      /* Text file busy */
#define EFBIG           27      /* File too large */
#define ENOSPC          28      /* No space left on device */
#define ESPIPE          29      /* Illegal seek */
#define EROFS           30      /* Read-only file system */
#define EMLINK          31      /* Too many links */
#define EPIPE           32      /* Broken pipe */
#define EDOM            33      /* Math argument out of domain of func */
#define ERANGE          34      /* Math result not representable */
#define EDEADLK         35      /* Resource deadlock would occur */
#define ENAMETOOLONG    36      /* File name too long */
#define ENOLCK          37      /* No record locks available */
#define ENOSYS          38      /* Function not implemented */
#define ENOTEMPTY       39      /* Directory not empty */
#define ELOOP           40      /* Too many symbolic links encountered */
#define EWOULDBLOCK     EAGAIN  /* Operation would block */
#define ENOMSG          42      /* No message of desired type */
#define EIDRM           43      /* Identifier removed */
#define ECHRNG          44      /* Channel number out of range */
#define EL2NSYNC        45      /* Level 2 not synchronized */
#define EL3HLT          46      /* Level 3 halted */
#define EL3RST          47      /* Level 3 reset */
#define ELNRNG          48      /* Link number out of range */
#define EUNATCH         49      /* Protocol driver not attached */
#define ENOCSI          50      /* No CSI structure available */
#define EL2HLT          51      /* Level 2 halted */
#define EBADE           52      /* Invalid exchange */
#define EBADR           53      /* Invalid request descriptor */
#define EXFULL          54      /* Exchange full */
#define ENOANO          55      /* No anode */
#define EBADRQC         56      /* Invalid request code */
#define EBADSLT         57      /* Invalid slot */
#define EDEADLOCK       EDEADLK
#define EBFONT          59      /* Bad font file format */
#define ENOSTR          60      /* Device not a stream */
#define ENODATA         61      /* No data available */
#define ETIME           62      /* Timer expired */
#define ENOSR           63      /* Out of streams resources */
#define ENONET          64      /* Machine is not on the network */
#define ENOPKG          65      /* Package not installed */
#define EREMOTE         66      /* Object is remote */
#define ENOLINK         67      /* Link has been severed */
#define EADV            68      /* Advertise error */
#define ESRMNT          69      /* Srmount error */
#define ECOMM           70      /* Communication error on send */
#define EPROTO          71      /* Protocol error */
#define EMULTIHOP       72      /* Multihop attempted */
#define EDOTDOT         73      /* RFS specific error */
#define EBADMSG         74      /* Not a data message */
#define EOVERFLOW       75      /* Value too large for defined data type */
#define ENOTUNIQ        76      /* Name not unique on network */
#define EBADFD          77      /* File descriptor in bad state */
#define EREMCHG         78      /* Remote address changed */
#define ELIBACC         79      /* Can not access a needed shared library */
#define ELIBBAD         80      /* Accessing a corrupted shared library */
#define ELIBSCN         81      /* .lib section in a.out corrupted */
#define ELIBMAX         82      /* Attempting to link in too many shared libraries */
#define ELIBEXEC        83      /* Cannot exec a shared library directly */
#define EILSEQ          84      /* Illegal byte sequence */
#define ERESTART        85      /* Interrupted system call should be restarted */
#define ESTRPIPE        86      /* Streams pipe error */
#define EUSERS          87      /* Too many users */
#define ENOTSOCK        88      /* Socket operation on non-socket */
#define EDESTADDRREQ    89      /* Destination address required */
#define EMSGSIZE        90      /* Message too long */
#define EPROTOTYPE      91      /* Protocol wrong type for socket */
#define ENOPROTOOPT     92      /* Protocol not available */
#define EPROTONOSUPPORT 93      /* Protocol not supported */
#define ESOCKTNOSUPPORT 94      /* Socket type not supported */
#define EOPNOTSUPP      95      /* Operation not supported on transport endpoint */
#define EPFNOSUPPORT    96      /* Protocol family not supported */
#define EAFNOSUPPORT    97      /* Address family not supported by protocol */
#define EADDRINUSE      98      /* Address already in use */
#define EADDRNOTAVAIL   99      /* Cannot assign requested address */
#define ENETDOWN        100     /* Network is down */
#define ENETUNREACH     101     /* Network is unreachable */
#define ENETRESET       102     /* Network dropped connection because of reset */
#define ECONNABORTED    103     /* Software caused connection abort */
#define ECONNRESET      104     /* Connection reset by peer */
#define ENOBUFS         105     /* No buffer space available */
#define EISCONN         106     /* Transport endpoint is already connected */
#define ENOTCONN        107     /* Transport endpoint is not connected */
#define ESHUTDOWN       108     /* Cannot send after transport endpoint shutdown */
#define ETOOMANYREFS    109     /* Too many references: cannot splice */
#define ETIMEDOUT       110     /* Connection timed out */
#define ECONNREFUSED    111     /* Connection refused */
#define EHOSTDOWN       112     /* Host is down */
#define EHOSTUNREACH    113     /* No route to host */
#define EALREADY        114     /* Operation already in progress */
#define EINPROGRESS     115     /* Operation now in progress */
#define ESTALE          116     /* Stale NFS file handle */
#define EUCLEAN         117     /* Structure needs cleaning */
#define ENOTNAM         118     /* Not a XENIX named type file */
#define ENAVAIL         119     /* No XENIX semaphores available */
#define EISNAM          120     /* Is a named type file */
#define EREMOTEIO       121     /* Remote I/O error */
#define EDQUOT          122     /* Quota exceeded */
#define ENOMEDIUM       123     /* No medium found */
#define EMEDIUMTYPE     124     /* Wrong medium type */


The list above should actually be sufficient, but next we show the corresponding
list for AIX (a bit redundant, of course):


1.2 errcodes AIX:
=================


#define EPERM   1       /* Operation not permitted              */
#define ENOENT  2       /* No such file or directory            */
#define ESRCH   3       /* No such process                      */
#define EINTR   4       /* interrupted system call              */
#define EIO     5       /* I/O error                            */
#define ENXIO   6       /* No such device or address            */
#define E2BIG   7       /* Arg list too long                    */
#define ENOEXEC 8       /* Exec format error                    */
#define EBADF   9       /* Bad file descriptor                  */
#define ECHILD  10      /* No child processes                   */
#define EAGAIN  11      /* Resource temporarily unavailable     */
#define ENOMEM  12      /* Not enough space                     */
#define EACCES  13      /* Permission denied                    */
#define EFAULT  14      /* Bad address                          */
#define ENOTBLK 15      /* Block device required                */
#define EBUSY   16      /* Resource busy                        */
#define EEXIST  17      /* File exists                          */
#define EXDEV   18      /* Improper link                        */
#define ENODEV  19      /* No such device                       */
#define ENOTDIR 20      /* Not a directory                      */
#define EISDIR  21      /* Is a directory                       */
#define EINVAL  22      /* Invalid argument                     */
#define ENFILE  23      /* Too many open files in system        */
#define EMFILE  24      /* Too many open files                  */
#define ENOTTY  25      /* Inappropriate I/O control operation  */
#define ETXTBSY 26      /* Text file busy                       */
#define EFBIG   27      /* File too large                       */
#define ENOSPC  28      /* No space left on device              */
#define ESPIPE  29      /* Invalid seek                         */
#define EROFS   30      /* Read only file system                */
#define EMLINK  31      /* Too many links                       */
#define EPIPE   32      /* Broken pipe                          */
#define EDOM    33      /* Domain error within math function    */
#define ERANGE  34      /* Result too large                     */
#define ENOMSG  35      /* No message of desired type           */
#define EIDRM   36      /* Identifier removed                   */
#define ECHRNG  37      /* Channel number out of range          */
#define EL2NSYNC 38     /* Level 2 not synchronized             */
#define EL3HLT  39      /* Level 3 halted                       */
#define EL3RST  40      /* Level 3 reset                        */
#define ELNRNG  41      /* Link number out of range             */
#define EUNATCH 42      /* Protocol driver not attached         */
#define ENOCSI  43      /* No CSI structure available           */
#define EL2HLT  44      /* Level 2 halted                       */
#define EDEADLK 45      /* Resource deadlock avoided            */
#define ENOTREADY       46      /* Device not ready             */
#define EWRPROTECT      47      /* Write-protected media        */
#define EFORMAT         48      /* Unformatted media            */
#define ENOLCK          49      /* No locks available           */
#define ENOCONNECT      50      /* no connection                */
#define ESTALE          52      /* no filesystem                */
#define EDIST           53      /* old, currently unused AIX errno*/
#define EINPROGRESS     55      /* Operation now in progress */
#define EALREADY        56      /* Operation already in progress */
#define ENOTSOCK        57      /* Socket operation on non-socket */
#define EDESTADDRREQ    58      /* Destination address required */
#define EDESTADDREQ     EDESTADDRREQ /* Destination address required */
#define EMSGSIZE        59      /* Message too long */
#define EPROTOTYPE      60      /* Protocol wrong type for socket */
#define ENOPROTOOPT     61      /* Protocol not available */
#define EPROTONOSUPPORT 62      /* Protocol not supported */
#define ESOCKTNOSUPPORT 63      /* Socket type not supported */
#define EOPNOTSUPP      64      /* Operation not supported on socket */
#define EPFNOSUPPORT    65      /* Protocol family not supported */
#define EAFNOSUPPORT    66      /* Address family not supported by protocol family */
#define EADDRINUSE      67      /* Address already in use */
#define EADDRNOTAVAIL   68      /* Can't assign requested address */
#define ENETDOWN        69      /* Network is down */
#define ENETUNREACH     70      /* Network is unreachable */
#define ENETRESET       71      /* Network dropped connection on reset */
#define ECONNABORTED    72      /* Software caused connection abort */
#define ECONNRESET      73      /* Connection reset by peer */
#define ENOBUFS         74      /* No buffer space available */
#define EISCONN         75      /* Socket is already connected */
#define ENOTCONN        76      /* Socket is not connected */
#define ESHUTDOWN       77      /* Can't send after socket shutdown */
#define ETIMEDOUT       78      /* Connection timed out */
#define ECONNREFUSED    79      /* Connection refused */
#define EHOSTDOWN       80      /* Host is down */
#define EHOSTUNREACH    81      /* No route to host */
#define ERESTART        82      /* restart the system call */
#define EPROCLIM        83      /* Too many processes */
#define EUSERS          84      /* Too many users */
#define ELOOP           85      /* Too many levels of symbolic links      */
#define ENAMETOOLONG    86      /* File name too long                     */
#define EDQUOT          88      /* Disc quota exceeded */
#define ECORRUPT        89      /* Invalid file system control data */
#define EREMOTE         93      /* Item is not local to host */
#define ENOSYS          109     /* Function not implemented  POSIX */
#define EMEDIA          110     /* media surface error */
#define ESOFT           111     /* I/O completed, but needs relocation */
#define ENOATTR         112     /* no attribute found */
#define ESAD            113     /* security authentication denied */
#define ENOTRUST        114     /* not a trusted program */
#define ETOOMANYREFS    115     /* Too many references: can't splice */
#define EILSEQ          116     /* Invalid wide character */
#define ECANCELED       117     /* asynchronous i/o cancelled */
#define ENOSR           118     /* temp out of streams resources */
#define ETIME           119     /* I_STR ioctl timed out */
#define EBADMSG         120     /* wrong message type at stream head */
#define EPROTO          121     /* STREAMS protocol error */
#define ENODATA         122     /* no message ready at stream head */
#define ENOSTR          123     /* fd is not a stream */
#define ECLONEME        ERESTART /* this is the way we clone a stream ... */
#define ENOTSUP         124     /* POSIX threads unsupported value */
#define EMULTIHOP       125     /* multihop is not allowed */
#define ENOLINK         126     /* the link has been severed */
#define EOVERFLOW       127     /* value too large to be stored in data type */


Actually, this is only a very small list of errors and codes:
it is ONLY associated with the interaction of a process with the system.
And even in that context, this is a limited list.

There are of course also many classes of errors you will never see in a trace.
Think of the errors that can occur at boot time of a system, or what an
error logging daemon might write to a logfile: those can be a very different story.
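
Because the same errno name can carry different numbers on different systems, it can help to compare two
header files directly. The following is only a sketch: the function name "errno_diff" is hypothetical, and it
assumes both files use plain "#define NAME number" lines like the two lists above.

```shell
# errno_diff HEADER_A HEADER_B
# Hypothetical helper: for every errno name that is defined with a number in
# both files, print "NAME number_in_A number_in_B" when the numbers differ.
errno_diff() {
    awk '
        $1 != "#define" || $3 !~ /^[0-9]+$/ { next }   # skip aliases such as EWOULDBLOCK
        NR == FNR { first[$2] = $3; next }              # first file: remember the numbers
        ($2 in first) && first[$2] != $3 { print $2, first[$2], $3 }
    ' "$1" "$2"
}
```

If the Linux and AIX lists above were saved as two files (Linux first), the output would contain, among others,
the lines "EDEADLK 35 45" and "ENOSYS 38 109".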


============================================================================
2. A quick one: The "truss" tool on many unixes:
============================================================================ 


Here is a quick way to trace a shell script or an executable program: using "truss".

The "truss" tool is available on many unix platforms. It has many options, but a very useful command
to trace the system calls that a script or program makes is:


$ truss -o /tmp/myprg.log myprg


In this example, truss will log in the file "/tmp/myprg.log" while it traces the program "myprg".

Of course, you can choose another path and logfile to trace to.

The command above is quite good for tracing a shell script or program that starts up, does some work,
and then terminates. If an error occurs during runtime, it's likely that you will find some pointers
in the logfile that truss made for you.

In the example above, you started the trace while activating the program at the same time.

You can attach to an existing process using the "pid", with the "-p" flag.

$ truss -p 5743


This tool has many more options; for example, you can focus your trace on a certain library.
Anyway, even the simple truss example above can already be very helpful.

So, for example, if you find the error "EACCES" (which is "errno 13 = Permission denied") in the log
that truss has produced, that would be really helpful. Evidently, your shell script or
program tries to access a certain object for which it has insufficient permissions,
and thus may fail.

Be warned though that some errnos might show up many times without being
something to worry about. For example "ENOENT = No such file or directory" might be found
quite often. Here, your script or program seems to be unable to find a file or directory.
Well, if it's related to the $PATH environment variable, that is quite reasonable:
your shell searches the directories in $PATH from beginning to end, until the object has been found.
Thus, it's quite normal that some ENOENT errors occur along the way.
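
To see at a glance which errnos dominate a logfile, you could count them. The sketch below is not part of truss
itself; the function name "errno_summary" is hypothetical, and it assumes that failed calls are logged either as
"Err#<number> <NAME>" (the form truss commonly uses) or as "= -1 <NAME>" (the form strace uses).

```shell
# errno_summary LOGFILE
# Hypothetical helper: count how often each errno name occurs in a trace log,
# most frequent first. Output lines look like "<count> <NAME>".
errno_summary() {
    awk '{
        for (i = 2; i <= NF; i++)
            if ($i ~ /^E[A-Z0-9]+$/ && ($(i-1) == "-1" || $(i-1) ~ /^Err#[0-9]+$/))
                count[$i]++
    }
    END { for (name in count) print count[name], name }' "$1" | sort -rn
}
```

A long run of ENOENT next to a single EACCES usually means the EACCES line is the one to look at first.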

In section 4.2 you can find some more info on truss.


============================================================================
3. Tracing in Linux:
============================================================================ 


3.1. strace:
============


>>> strace example on Linux:

One main trace utility on most Linux distros is the "strace" command.
You can use it with many parameters, but "-o outputfile" is very important, because it saves the output to a file.

Use it like:

# strace -o logfile <name_of_command_or_program_you_want_to_trace> 

# strace -o logfile -p <process_id>     # In cases where you want to trace a process that is already running, 
                                        # pass the -p option to strace.


Because strace shows you the system calls and signals, you can use it to reveal whether a program cannot
find a file, or does not have permission to read (or write to) a file. Either problem can make a program fail.
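
Since file problems are such a common cause, it can pay off to restrict the trace to file-related calls.
The wrapper below is a sketch only: the name "trace_files" and the DRY_RUN variable are made up for this note.
The strace options themselves are standard: "-f" follows child processes, "-tt" adds timestamps, and
"-e trace=file" keeps only system calls that take a file name as an argument.

```shell
# trace_files LOGFILE COMMAND [ARGS...]
# Hypothetical wrapper: trace only file-related system calls of COMMAND
# (and its children) into LOGFILE. With DRY_RUN set, only print the command.
trace_files() {
    log=$1
    shift
    set -- strace -f -tt -e trace=file -o "$log" "$@"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example: "trace_files /tmp/myprg.log myprg".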


Example 1:
----------

Suppose we have a file called "/etc/security.conf". Now we run a utility to read the file (like cat, pg, more, less etc..)
as a normal user who does not have permission to read the file. Let's trace that event to a logfile, and see
what we can discover.

$ strace -o strace_example.log less /etc/security.conf

A trace file can get pretty long, but you should just browse it and be alert for anything that looks like a reported error.
So, if we take a look in the logfile "strace_example.log":

  ..
  ..
  open("/etc/security.conf", O_RDONLY|O_LARGEFILE) = -1 EACCES (Permission denied)
  write(2, "/etc/security.conf: Permission denied\n", 38) = 38
  ..
  ..

We can clearly see that our program failed due to lack of permission.
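
In a long logfile you may not want to browse for such lines by hand. As a sketch (the function name
"failed_paths" is hypothetical, and it assumes the usual strace layout of a quoted path followed by
"= -1 <ERRNO>"), the following prints the errno and the path of every failed call that names a file:

```shell
# failed_paths LOGFILE
# Hypothetical helper: for each failed call with a quoted path argument in an
# strace log, print "<ERRNO> <path>".
failed_paths() {
    awk -F'"' '/ = -1 E[A-Z0-9]+/ && NF >= 3 {
        n = split($NF, w, " ")          # words after the last quoted string
        for (i = 1; i < n; i++)
            if (w[i] == "-1") { print w[i+1], $2; break }
    }' "$1"
}
```

Run against the logfile of this example, "failed_paths strace_example.log" would include the line
"EACCES /etc/security.conf".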

Example 2:
----------

You can use strace in many ways. Another famous "error" you might find using strace is that a program needs a library
but can't find it, like in this example:

  ..
  open("/opt/tux/cbl/lib/libdcpybk.so", O_RDONLY) = -1 ENOENT (No such file or directory)
  ..

Remark:

To find out which libraries a program needs, you might also try the ldd command.
For example, what uuencode needs is shown below (this sample output is from AIX; on Linux, ldd lists
".so" shared objects instead):

$ ldd uuencode
uuencode needs:
         /usr/lib/libc.a(shr.o)
         /unix
         /usr/lib/libcrypt.a(shr.o)
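
On Linux, ldd normally marks a library it cannot resolve with "=> not found". Assuming that output format,
a tiny filter (the name "missing_libs" is made up here) lists just the missing libraries:

```shell
# missing_libs
# Hypothetical filter: read ldd output on standard input and print the names
# of the libraries that ldd reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

Use it as: "ldd /path/to/program | missing_libs". No output means ldd resolved every library.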


3.2. ltrace:
============

While "strace" deals with system calls, if you want to track which library calls an application makes,
you can use the "ltrace" command.
It works very similarly to "strace".

Example:

$ ltrace -o ls_example_trace_file.trc ls


3.3. LTT Linux Trace Toolkit:
=============================

Strace, as we have seen above, traces a single process by default (add "-f" to follow its child processes)
and presents the result in text form. To trace many processes in
a given period of time, Linux Trace Toolkit (LTT) is a better choice. LTT is distributed as free software under the GPL.
The trace toolkit provides a daemon, which captures the events and writes them to disk.

It's (generally) not a standard feature of Linux, and you need to obtain it elsewhere. If you are interested, just Google on
Linux Trace Toolkit, to find current info.

Basically, you run the tracedaemon, and after a while, you use the tracevisualizer to view the results
in graphical form.


3.4. Other possibly useful Linux commands (limited list):
=========================================================

Although not directly related to tracing, the following limited list of commands might help in creating a better view of
your system and processes. I am sure you are familiar with them, but let's list them anyway:

-- Show your OS version:

# cat /proc/version 
# uname -a

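-- Show the open files that a process uses:

# lsof -p pid                       # (pfiles is the Solaris command for this; it is normally not available on Linux)
# ls -l /proc/pid/fd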

-- Show the jobs that are scheduled (in the account you use) from cron:

# crontab -l

-- What are the standard mounted filesystems: That's defined in "/etc/fstab"

# cat /etc/fstab

-- Which processes are using a certain filesystem?

# fuser -c /filesystem     # We mean the "mountpoint", like for example "/apps/oracle"

-- Show memory usage of a process:

# pmap -d pid                       # (Most important options: -x  Show the extended format; -d Show the device format.)
                                    # (And pid is the process-id, as visible in the command "ps -ef".)

-- Show system memory:

# cat /proc/meminfo
# dmesg | grep -i "memory"
# free                              # (the free command)   

-- Swap usage:

# cat /proc/swaps                   # Above 60%-70% it's getting scary
# cat /proc/meminfo

-- cpu info:

# cat /proc/cpuinfo

-- user and process limits:

Sometimes, when a process runs under some account and fails for no obvious reason, it might be
worth checking the "ulimit" settings of that account (like max file size, max number of open files etc..).
Run it under that account as:

# ulimit -a
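
The "open files" limit is the one behind errno EMFILE (24, "Too many open files"). As a rough sketch for Linux
only, the function below compares the number of descriptors a process has open with its soft limit. The name
"fd_usage" is hypothetical, and the code assumes a /proc filesystem that provides "/proc/<pid>/fd" and
"/proc/<pid>/limits".

```shell
# fd_usage [PID]
# Hypothetical helper (Linux /proc assumed): print how many file descriptors
# PID has open, next to its soft "Max open files" limit. Defaults to this shell.
fd_usage() {
    pid=${1:-$$}
    if [ ! -d "/proc/$pid/fd" ] || [ ! -r "/proc/$pid/limits" ]; then
        echo "fd_usage: cannot read /proc/$pid (not Linux, no such pid, or no permission)" >&2
        return 1
    fi
    used=$(ls "/proc/$pid/fd" | wc -l | tr -d ' ')
    soft=$(awk '/^Max open files/ { print $4 }' "/proc/$pid/limits")
    echo "pid $pid: $used descriptors open, soft limit $soft"
}
```

For a process of another account you typically need to run this as that account, or as root.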

-- Show processtree of parent and children:

# pstree pid                       # on some distros ptree is implemented


-- Show the system error report / error log:

# cat /var/log/messages | more    # (more pages the output, so that the contents do not scroll past "at once")


-- Determine the type of a file (e.g. is it ascii, or another type of file?)

# file file_name                  # (the command is really named "file")


-- Show free/used space of the filesystems:

# df -m                           # m in MB; k in KB

If there are many filesystems, you might want to see just the 5 that are fullest (highest Use%):

# df -kP | awk 'NR>1 {print $5,$6}' | sort -n | tail -5
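
A full filesystem is also a classic reason for a failing process (errno ENOSPC). As a sketch, the filter below
prints only the filesystems at or above a given usage percentage. The name "fs_over" is made up for this note,
and it assumes the POSIX output format of "df -kP" (capacity in column 5, mount point in column 6) and
mount points without spaces.

```shell
# fs_over [PERCENT]
# Hypothetical filter: read "df -kP" output on standard input and print
# "<mountpoint> <capacity>" for filesystems at or above PERCENT (default 90).
fs_over() {
    awk -v limit="${1:-90}" '
        NR > 1 {
            pct = $5
            sub(/%/, "", pct)
            if (pct + 0 >= limit + 0) print $6, $5
        }'
}
```

Use it as: "df -kP | fs_over 80".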

-- How to become another user, or possibly root:

# su - accountname       # (switch to that accountname like "su - albert")
# su -                   # (switch to root)
                         # if the sudo utility is implemented, you might try the command "sudo -l" to see what you might execute.

-- Careful!! How to kill a process "the hard way"?

# kill -9 PID              # careful, don't kill the wrong one; not recommended unless you have no other choice.

-- Careful!! How to kill all your processes "the hard way", all at once?

# kill -9 -1               # very careful; not recommended unless you have no other choice.
# killall name             # on Linux, killall kills all processes with the given name. Very careful; not recommended unless you have no other choice.

-- Show your uid (userid) and gid (groupid):

# id

-- refreshing (restarting) inetd after modifying "/etc/inetd.conf"

# service xinetd restart	    # depending on the distro, like RedHat					
# /etc/init.d/inetd restart	

-- To show the init runlevel:

# who -r 

-- Show uptime of system plus load averages (over the last 1, 5 and 15 minutes)

# uptime

-- Show the last logged on users: account name & pts & date (history as far back as the wtmp log goes)

# last | more


============================================================================
4. Tracing in AIX:
============================================================================

In AIX, tracing commands like "truss", "syscalls" and "trace" are available.

First we will talk about the "trace" facility, for which AIX also offers a user-friendly
interface. It's a menu based system (via smitty). But you can use "trace" on the command line as well.
The neat thing here is that you can trace a PID, a program, or just everything.

We will start with the command "smitty trace". We will instruct the system to create
a raw tracefile first (not easily readable), and then, after we have stopped tracing, we create
an ascii (readable) file from the raw file.


4.1. Setting up a trace with "smitty trace":
============================================


>>> Define and start the trace:
-------------------------------

You can start with

$ smitty trace

The following menu appears:

Move cursor to desired item and press Enter.

  START Trace
  STOP Trace
  Generate a Trace Report
  Manage Trace
  Manage Event Groups


First we choose "START Trace"

The following menu appears:

FIG. 1.

                                                        [Entry Fields]
  EVENT GROUPS to trace                              []               
  ADDITIONAL event IDs to trace                      []               
  Event Groups to EXCLUDE from trace                 []               
  Event IDs to EXCLUDE from trace                    []               
->Process IDs to Trace                               []               
  Program to Trace                                   []
  Propagate Tracing to                               [new processes and threads] 
  Trace MODE                                         [alternate]                 
  STOP when log file full?                           [no]                        
  LOG FILE                                           [/var/adm/ras/trcfile]
  SAVE PREVIOUS log file?                            [no]                  
  Omit PS/NM/LOCK HEADER to log file?                [yes]                 
  Omit DATE-SYSTEM HEADER to log file?               [no]                  
  Run in INTERACTIVE mode?                           [no]                  
  Trace BUFFER SIZE in bytes                         [262144]              
  LOG FILE SIZE in bytes                             [2621440]             
  Buffer Allocation                                  [automatic]  

Now move to the item:


- "Process IDs to Trace":

Hopefully, you know the "pid", or "process id", of the process you want to trace.
If not, you can probably find it with the "ps -ef" command.
If you do not specify a particular pid, your trace is going to capture almost all processes,
which of course can lead to incredibly large and fast-growing traces.
In this example, we do not fill in a pid. Normally, you should always specify the pid you want
to trace.

Next, move to the item:

- "LOG FILE":

Now we change the logfile location from the default "/var/adm/ras/trcfile" to another suitable filesystem and filename,
like "/tmp/trcraw" (the /var filesystem is usually not a good place to store your own large tracefile).
In this example, we use "/tmp" as the filesystem to store our tracefile (if there is enough free space).
And we give the tracefile the name "trcraw", because it will not contain readable text (at first),
hence the "raw".

Next, move to the item:

- "LOG FILE SIZE in bytes":

It might be a good idea to limit the size of the tracefile. For example, if you only have 1GB free in /tmp,
you must stay well below that size.
But you will see that tracing to a file makes the file size "explode". It can grow incredibly fast, also
depending on the event groups you trace.
Undoubtedly, you will see that for yourself. If you trace too many events, it can be as bad as 500MB per minute.
But in this example, we stay "modest" in sizes.

So here, we have taken the example value of 100MB (104857600 bytes).
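
The menu wants the size in bytes, so the value is simply 100 x 1024 x 1024. If you prefer not to do that by
hand, a one-line shell function does the arithmetic (the name "mb_to_bytes" is just an illustration):

```shell
# mb_to_bytes MEGABYTES
# Convert a whole number of megabytes (1 MB = 1024 * 1024 bytes) to bytes.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mb_to_bytes 100     # prints 104857600
```

By the same arithmetic, the defaults shown in FIG. 1 are 256KB (262144 bytes) for the trace buffer and
2.5MB (2621440 bytes) for the log file.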


FIG. 2.
                                                        [Entry Fields]
  EVENT GROUPS to trace                              []      
  ADDITIONAL event IDs to trace                      []      
  Event Groups to EXCLUDE from trace                 []      
  Event IDs to EXCLUDE from trace                    []      
  Process IDs to Trace                               []      
  Program to Trace                                   []
  Propagate Tracing to                               [new processes and threads]     
  Trace MODE                                         [alternate]                     
  STOP when log file full?                           [yes]                           
  LOG FILE                                           [/tmp/trcraw]
  SAVE PREVIOUS log file?                            [no]         
  Omit PS/NM/LOCK HEADER to log file?                [yes]        
  Omit DATE-SYSTEM HEADER to log file?               [no]         
  Run in INTERACTIVE mode?                           [no]         
  Trace BUFFER SIZE in bytes                         [262144]     
  LOG FILE SIZE in bytes                             [104857600]        (changed to 100MB)
  Buffer Allocation                                  [automatic]   

Next, move to

- "STOP when log file full?"

Decide whether you want to stop logging when the size limit has been reached (generally a good idea).
You can choose between "yes" and "no" via the F4 key.

Next, we move to 

- "EVENT GROUPS to trace":

When you have your cursor at this item, press F4. An impressive list of "counters", or trace-able events, is shown.
With the F7 key, you can toggle the selection of an event group on or off.
Remember, the more event groups you choose, the more "intensively" the system will trace, and the faster
your tracefile will grow.

Believe me: if you want to create a relatively simple trace for troubleshooting purposes, then selecting
- fop - FILE OPENS (reserved)
- fact - FILE ACTIVITY (open,close,read,write) (reserved)

can be sufficient, because many process failures are related to permission problems (on files and directories)
or to files that cannot be found (like libraries, logfiles etc.).

So, in this example we just choose those event groups, and press Enter.


FIG. 3.

                                         +--------------------------------------------------------------------------+
                                         |                          EVENT GROUPS to trace                           |
                                         |                                                                          |
                                         | Move cursor to desired item and press F7. Use arrow keys to scroll.      |
  EVENT GROUPS to trace                  |     ONE OR MORE items can be selected.                                   |
  ADDITIONAL event IDs to trace          | Press Enter AFTER making all selections.                                 |
  Event Groups to EXCLUDE from trace     |                                                                          |
  Event IDs to EXCLUDE from trace        | [TOP]                                                                    |
  Process IDs to Trace                   |   tidhk - Hooks needed to display thread name (reserved)                 |
  Program to Trace                       |   gka - GENERAL KERNEL ACTIVITY (files,execs,dispatches) (reserved)      |
  Propagate Tracing to                   |   gkasc - GENERAL KERNEL ACTIVITY + SYSTEM CALLS (reserved)              |
  Trace MODE                             |   fop - FILE OPENS (reserved)                                            |
  STOP when log file full?               |   fact - FILE ACTIVITY (open,close,read,write) (reserved)                |
  LOG FILE                               |   proc - EXECS, FORKS, EXITS (reserved)                                  |
  SAVE PREVIOUS log file?                |   procd - EXECS, FORKS, DISPATCHES (reserved)                            |
  Omit PS/NM/LOCK HEADER to log file?    |   filephys - FILE ACTIVITY (with physical file system) (reserved)        |
  Omit DATE-SYSTEM HEADER to log file?   |   filepfsv - FILE ACTIVITY (with physical file system and VMM) (reserved |
  Run in INTERACTIVE mode?               |   filepvl - FILE ACTIVITY (with physical file system, VMM, and LVM) (res |
  Trace BUFFER SIZE in bytes             |   filepvld - FILE ACTIVITY (w/ phys. file sys., VMM, LVM, and disk) (res |
  LOG FILE SIZE in bytes                 |   syscall - SYSTEM CALLS (reserved)                                      |
  Buffer Allocation                      |   inthands - FLIHS and SLIHS (reserved)                                  |
                                         |   lfs - LOGICAL FILE SYSTEM (deprecated, use vnops and vfsops) (reserved |
                                         |   pfs - PHYSICAL FILE SYSTEM (reserved)                                  |
                                         |   vmm - VIRTUAL MEMORY MANAGER (reserved)                                |
                                         |   vmmsvc - VMM SERVICES (reserved)                                       |
                                         |   lvm - LOGICAL VOLUME MANAGER (reserved)                                |
                                         |   lvmbb - LOGICAL VOLUME MANAGER BADBLOCK EVENTS (reserved)              |
                                         |   ipcgen - IPC: GENERAL (reserved)                                       |
                                         |   ipcsm - IPC: SHARED MEMORY (reserved)                                  |
                                         |   ipcmsgs - IPC: MESSAGES (reserved)                                     |
                                         |   ipcsem - IPC: SEMAPHORES (reserved)                                    |
                                         |   ipcmmap - IPC: MMAP (reserved)                                         |
                                         |   ipcmsem - IPC: MSEMAPHORES (reserved)                                  |
                                         |   errlg - ERROR LOGGING (reserved)                                       |
                                         |   parpdd - DEVICE DRIVER: PARALLEL PRINTER (reserved)                    |
                                         |   tapedd - DEVICE DRIVER: TAPE (reserved)                                |
                                         |   entdd - DEVICE DRIVER: ETHERNET - HIGH PERFORMANCE LAN ADAPTER (8ef5)  |
                                         |   tokdd - DEVICE DRIVER: TOKEN RING - HIGH PERFORMANCE ADAPTER (8fc8) (r |
                                         |   c3270dd - DEVICE DRIVER: C3270 (reserved)                              |
                                         |   fddd - DEVICE DRIVER: FLOPPY DISK (reserved)                           |
                                         |   scsidd - DEVICE DRIVER: SCSI (reserved)                                |
                                         |   sisadd - DEVICE DRIVER: PCI-X SCSI (reserved)                          |
                                         |   sissasdd - DEVICE DRIVER: SAS (reserved)                               |
                                         |   diskdd - DEVICE DRIVER: DISK (reserved)                                |
                                         |   mpqdd - DEVICE DRIVER: MULTI-PROTOCAL ADAPTERS (reserved)              |
                                         |   graphdd - DEVICE DRIVER: GRAPHICS (reserved)                           |
                                         |   ttydd - DEVICE DRIVER: pty (reserved)                                  |
                                         |   rs232dd - DEVICE DRIVER: rs232 (reserved)                              |
                                         |   64portdd - DEVICE DRIVER: 64 PORT ASYNC CONTROLLER (reserved)          |
                                         |   x25dd - DEVICE DRIVER: X25 (reserved)                                  |
                                         |   harierdd - DEVICE DRIVER: HARRIER2 (reserved)                          |
                                         |   scsitgdd - DEVICE DRIVER: SCSI Target Mode (reserved)                  |
                                         |   lpfkdd - DEVICE DRIVER: Dials/LPFKeys (reserved)                       |
                                         | [MORE...36]                                                              |
                                         |                                                                          |
                                         | F1=Help                 F2=Refresh              F3=Cancel                |
F1=Help                                F2| F7=Select               F8=Image                F10=Exit                 | F4=List
F5=Reset                               F6| Enter=Do                /=Find                  n=Find Next              | F8=Image


Now the trace will start and you should see the file "/tmp/trcraw" grow in size.
You can see that with:

$ ls -al /tmp/trcraw

Also, try this command from the prompt:

$ ps -ef | grep trace

and you should see your trace running in the process list.
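
If you want to keep an eye on how fast the raw file grows, a small loop can repeat that check for you.
This is only a sketch, assuming a POSIX shell; the names "file_size_bytes" and "watch_growth" are
made up for this illustration:

```shell
# Print the size of a file in bytes (hypothetical helper name).
file_size_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Print a timestamp and the size of a file, COUNT times, INTERVAL seconds apart.
# Usage: watch_growth /tmp/trcraw 5 12     (every 5 seconds, 12 times)
watch_growth() {
    file=$1; interval=$2; count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "$(date '+%H:%M:%S') $(file_size_bytes "$file") bytes"
        sleep "$interval"
        i=$(( i + 1 ))
    done
}
```

If the numbers climb faster than you expected, stop the trace early rather than waiting for the filesystem to fill up.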

IMPORTANT:

Did you note that we did not select a PID (process ID) to trace on? So, actually, we trace (almost) all processes
"which do something" in the event groups we selected.

Of course, if you know the PID you want to trace, you just fill it in in the menu shown in Fig. 2.

If you select to trace a PID (only), then your tracefile will of course not grow as fast as it would in our example.

But even in our example (where we trace all processes on the selected event groups), we can see marvelous things.
Suppose Oracle and/or WebSphere, or monitoring tools (or you name it), are running. Later on, when you inspect the tracefile,
you can find very valuable information about what those processes do "under the hood".


Remember, we are creating a raw trace file here. We still need to do one extra step, after stopping the trace.


>>> Stop the trace and create a readable file:
----------------------------------------------


Ok, if you have left smitty, start it up again.

$ smitty trace

In the menu that follows, just select "STOP Trace".

  START Trace
  STOP Trace
  Generate a Trace Report
  Manage Trace
  Manage Event Groups

and the trace facility will stop tracing.
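
The smitty menus are a front end to the "trace" and "trcstop" commands, so the choices from Fig. 2 can also be
given at the prompt. The sketch below only prints the command line, so you can inspect it before running it
yourself. The flags are as I understand them from the AIX "trace" man page, so please verify them on your
own system; the name "build_trace_cmd" is made up for this illustration:

```shell
# Build (but do not run) a trace command line that mirrors the menu values.
# Flags, to be verified with "man trace" on your AIX level:
#   -a  run trace in the background      -J  event groups to trace
#   -o  raw log file                     -L  log file size in bytes
#   -T  trace buffer size in bytes       -s  stop when the log file is full
# "build_trace_cmd" is a hypothetical helper name.
build_trace_cmd() {
    groups=$1; logfile=$2; logsize=$3; bufsize=$4
    echo "trace -a -s -J $groups -o $logfile -L $logsize -T $bufsize"
}

build_trace_cmd fop,fact /tmp/trcraw 104857600 262144
# To stop the trace again from the prompt, use: trcstop
```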

Next, we want to have a readable file, which we can view (use cat, pg, more, grep etc.).
In smitty, there are options available to create a trace report, but I think it's more instructive
to do this from the prompt. Here we go:

We have a raw trace in the file /tmp/trcraw.
Let's create a readable file from the raw file, and call it "/tmp/trctxt".

You can do that with, for example:

$ trcrpt -O pid=on,exec=on /tmp/trcraw > /tmp/trctxt


Please be aware that the text file is typically 2 or 3 times larger than the raw file. So, always check the available
space in the filesystem where you want to create the file.

Now you can open the file, or grep it for an identifier etc.
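
A quick first pass over the readable file is to pull out the lines that carry an error name, like the
"return from statx. error ENOENT [13 usec]" line we saw in Section 1. A minimal sketch, assuming the report
is in "/tmp/trctxt" and that errors show up in that "error EXXX" form; the function names are made up
for this illustration:

```shell
# List (with line numbers) the lines of a readable trace report that carry
# an errno name, such as "return from statx. error ENOENT [13 usec]".
# "trace_errors" is a hypothetical helper name.
trace_errors() {
    grep -n 'error E[A-Z0-9]*' "$1"
}

# Count how often each error name occurs, most frequent first.
trace_error_counts() {
    sed -n 's/.*error \(E[A-Z0-9]*\).*/\1/p' "$1" | sort | uniq -c | sort -rn
}

# Usage: trace_errors /tmp/trctxt | more
#        trace_error_counts /tmp/trctxt
```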


>>> Some important subroutines, or how to "read" the trace:
-----------------------------------------------------------


If you inspect your trace with cat, vi, or whatever tool, it seems rather full of
cryptic messages, like

101 ksh            1175648       4.075516073       0.009374           fstatx LR = D0376824
104 ksh            1175648       4.075520760       0.004687           return from fstatx [5 usec]

104 ksh            1175648       4.074383326       0.001349           return from __loadx [1 usec]
101 ksh            1175648       4.074383929       0.000603           __loadx LR = D03B8288
104 ksh            1175648       4.074385155       0.001226           return from __loadx [1 usec]
101 ksh            1175648       4.074395702       0.010547           getuidx LR = D03BF94C
104 ksh            1175648       4.074396214       0.000512           return from getuidx [1 usec]


>>> statx, stat, lstat, fstatx, fstat, fullstat, ffullstat, stat64, lstat64, fstat64, stat64x, fstat64x, or lstat64x Subroutine

Purpose
Provides information about a file or shared memory object.
Library
Standard C Library (libc.a)


>>> vnop_open Entry Point

Purpose
Requests that a file be opened for reading or writing.
Return Values
0 Indicates success. 
Nonzero return values are returned from the /usr/include/sys/errno.h file to indicate failure.


>>> getuid, geteuid, or getuidx Subroutine

Purpose
Gets the real or effective user ID of the current process.
Library
Standard C Library (libc.a)


4.2. A few examples of using the truss command:
===============================================


With "truss" you can trace a command, or trace an existing process. It shows all system calls (or a selection) made, with their arguments
and the return code. System call parameters are displayed symbolically. 
It also prints information about all signals received by a process.

You can use truss in the following way:

# truss [options] command

You must understand that in this way, you actually start the command under truss, which will then
display the calls to the system and external libraries.

# truss [options] -p PID

In this case, you 'attach' to an existing process.

There are many parameters (or options) you can use, but a few of the most important options are:

-o truss.log		# So here you save the truss trace to the logfile "truss.log"
-t [!] Syscall		# If you leave out -t, you trace all syscalls. Indeed, the default is "-tall".
                        # If you use -t, you can also give a comma separated list of the calls you want to
                        # trace, like "-t open,statx,close", where you will only trace open, close, statx.
                        # You can also exclude certain syscalls, by using "-t ! syscall".
-u [!] [LibraryName]    # Here you can give a comma separated list of libraries to which you want to trace the calls.
                        # Using -u ! LibraryName, you can exclude a certain library from the trace.


Let's take a look at a few simple examples:

Example 1:
----------

Suppose in /opt/app/cc we have a program called "test".
Somebody from your group tries to run it, but it immediately dies, and you don't have a clue as to what caused it.
It was supposed to present your colleague with a menu screen to work with, but that never happened.

Of course, any well behaved program should give a message on the screen, or write
status information to a logfile.
But suppose we are dealing with a program without those nice features.

$ ./test

And it dies, while we were expecting a menuscreen to work with.
Why did it die?


Let's try truss:


$ truss ./test
execve("test", 0xFFBFFDEC, 0xFFBFFDF4)  argc = 1
getcwd("/home/albert", 1015)               = 0
stat("/home/albert/test", 0xFFBFFBC8)   = 0
open("/var/ld/ld.config", O_RDONLY)             Err#2 ENOENT
stat("/opt/csw/lib/libc.so.1", 0xFFBFF6F8)      Err#2 ENOENT
stat("/lib/libc.so.1", 0xFFBFF6F8)              = 0
resolvepath("/lib/libc.so.1", "/lib/libc.so.1", 1023) = 14
open("/lib/libc.so.1", O_RDONLY)                = 3
memcntl(0xFF280000, 139692, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
close(3)                                        = 0
getcontext(0xFFBFF8C0)
getrlimit(RLIMIT_STACK, 0xFFBFF8A0)             = 0
getpid()                                        = 7895 [7894]
setustack(0xFF3A2088)
open("/opt/app/etc/cc.conf", O_RDONLY)          Err#13 EACCES [file_dac_read]     <--- !!!
ioctl(1, TCGETA, 0xFFBFEF14)                    = 0


Now note the line that I have marked with "!!!". Here you see Err#13 EACCES.

From the lists in Section 1, we can find that Error 13 corresponds to "Permission denied".

So, suppose that you go to "/opt/app/etc/" and check the permissions on the file "cc.conf"; you would find
that the permissions on that file should be altered, for example with:
$ chmod g+r cc.conf                    # here we give the group read permission on "cc.conf"

After that, the program runs without errors. Probably this was a program that first wanted to read configuration information
from "/opt/app/etc/cc.conf", and if that fails, the program would just terminate without any message.
Of course, that program could have been designed much better.
But we have seen an example where truss was of use.
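
With longer output you will not want to hunt for the failing call by eye. If you saved the trace with
"-o truss.log", the failed calls can be filtered out, because truss marks them with "Err#<number> <NAME>"
as in the listing above. A minimal sketch; the function names are made up for this illustration:

```shell
# Show only the failed calls in saved truss output.
# "truss_errors" is a hypothetical helper name.
truss_errors() {
    grep 'Err#[0-9]' "$1"
}

# Same, but leave out one error name. In the listing above, for example,
# the two ENOENT lines were harmless and only the EACCES line mattered.
truss_errors_except() {
    grep 'Err#[0-9]' "$1" | grep -v "Err#[0-9]* $2"
}

# Usage: truss -o truss.log ./test
#        truss_errors truss.log
#        truss_errors_except truss.log ENOENT
```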


Example 2:
----------

Let's run the program "lsps -s" (show pagingspace) from my home dir, and let's truss it, to see what systemcalls it makes:

albert@sharky:/home/albert $ truss lsps -s

execve("/usr/sbin/lsps", 0x2FF22A4C, 0x2000EB28)  argc: 2
__loadx(0x03000000, 0x2FF22870, 0x000000F0, 0x10000000, 0x20000E14) = 0x00000000
__loadx(0x0A040000, 0xD0572CD4, 0x0000000A, 0x00000000, 0x00000000) = 0x00000000
sbrk(0x00000000)                                = 0x20004570
vmgetinfo(0x2FF21C30, 7, 16)                    = 0
sbrk(0x00000000)                                = 0x20004570
__libc_sbrk(0x00000000)                         = 0x20004570
getuidx(4)                                      = 6318
getuidx(2)                                      = 6318
getuidx(1)                                      = 6318
getgidx(4)                                      = 1105
getgidx(2)                                      = 1105
getgidx(1)                                      = 1105
__loadx(0x01000080, 0x2FF216E0, 0x00000960, 0x2FF22160, 0x00000000) = 0xD0149130
__loadx(0x0A040000, 0xD0572CA0, 0x2FF22FFC, 0x0000D0B2, 0x00000000) = 0x00000000
__loadx(0x01000180, 0x2FF216E0, 0x00000960, 0xF028CC4C, 0xF028CB7C) = 0xF03358D8
__loadx(0x0A040000, 0xD0572CA0, 0x2FF22FFC, 0x0000D0B2, 0x00000000) = 0x00000000
__loadx(0x07080000, 0xF028CC1C, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336808
__loadx(0x07080000, 0xF028CB5C, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336814
__loadx(0x07080000, 0xF028CC2C, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336844
__loadx(0x07080000, 0xF028CB6C, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336850
__loadx(0x07080000, 0xF028CBEC, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336820
__loadx(0x07080000, 0xF028CB8C, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336838
__loadx(0x07080000, 0xF028CBFC, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF033685C
__loadx(0x07080000, 0xF028CC0C, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF033688C
__loadx(0x07080000, 0xF028CB9C, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336874
__loadx(0x07080000, 0xF028CBAC, 0xFFFFFFFF, 0xF03358D8, 0x00000000) = 0xF0336910
getuidx(4)                                      = 6318
getuidx(2)                                      = 6318
getuidx(1)                                      = 6318
getgidx(4)                                      = 1105
getgidx(2)                                      = 1105
getgidx(1)                                      = 1105
__loadx(0x01000080, 0x2FF216E0, 0x00000960, 0x2FF22160, 0x00000000) = 0xD0149130
getuidx(4)                                      = 6318
getuidx(2)                                      = 6318
getuidx(1)                                      = 6318
getgidx(4)                                      = 1105
getgidx(2)                                      = 1105
getgidx(1)                                      = 1105
__loadx(0x01000080, 0x2FF216E0, 0x00000960, 0x2FF22160, 0x00000000) = 0xD0149130
getuidx(4)                                      = 6318
getuidx(2)                                      = 6318
getuidx(1)                                      = 6318
getgidx(4)                                      = 1105
getgidx(2)                                      = 1105
getgidx(1)                                      = 1105
__loadx(0x01000080, 0x2FF216E0, 0x00000960, 0x2FF22160, 0x00000000) = 0xD0149130
getuidx(4)                                      = 6318
getuidx(2)                                      = 6318
getuidx(1)                                      = 6318
getgidx(4)                                      = 1105
getgidx(2)                                      = 1105
getgidx(1)                                      = 1105
__loadx(0x01000080, 0x2FF216E0, 0x00000960, 0x2FF22160, 0x00000000) = 0xD0149130
getuidx(4)                                      = 6318
getuidx(2)                                      = 6318
getuidx(1)                                      = 6318
getgidx(4)                                      = 1105
getgidx(2)                                      = 1105
getgidx(1)                                      = 1105
__loadx(0x01000080, 0x2FF216E0, 0x00000960, 0x2FF22160, 0x00000000) = 0xD0149130
access("/usr/lib/nls/msg/en_US/cmdps.cat", 0)   = 0
_getpid()                                       = 483490
psdanger(0)                                     = 524288
psdanger(-1)                                    = 521468
open("/usr/lib/nls/msg/en_US/cmdps.cat", O_RDONLY) = 3
kioctl(3, 22528, 0x00000000, 0x00000000)        Err#25 ENOTTY
kfcntl(3, F_SETFD, 0x00000001)                  = 0
kioctl(3, 22528, 0x00000000, 0x00000000)        Err#25 ENOTTY
kread(3, "\0\001 �\001\001 I S O 8".., 4096)    = 4096
lseek(3, 0, 1)                                  = 4096
lseek(3, 0, 1)                                  = 4096
lseek(3, 0, 1)                                  = 4096
_getpid()                                       = 483490
lseek(3, 0, 1)                                  = 4096
kioctl(1, 22528, 0x00000000, 0x00000000)        = 0
Total Paging Space   Percent Used
kwrite(1, " T o t a l   P a g i n g".., 34)     = 34
      2048MB               1%
kwrite(1, "             2 0 4 8 M B".., 30)     = 30
__loadx(0x04000000, 0x2FF22080, 0x00000800, 0x0000D0B2, 0x00000000) = 0x00000000
kfcntl(1, F_GETFL, 0x00000001)                  = 67110914
kfcntl(2, F_GETFL, 0xF02DF418)                  = 67110914
_exit(0)


There is a lot of output on the screen. I entered "lsps -s", and truss watches which syscalls are made
and shows them on your screen.
In fact, many of the lines deal with "getuidx" and that kind of call. The system would like to know
who issued the command (and in what groups he/she is).
You can ignore the output, because it's not that interesting. I only "published" it here to give you an
idea of how much output tracing commands (like truss) generate.


If I just want to store that information to a logfile, for example "truss.log", I would use the following command:

albert@sharky:/home/albert $ truss -o truss.log lsps -s
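
Once the output is in a logfile, you can summarize it instead of reading it line by line. The sketch below
counts how often each call occurs, which quickly shows that calls like "getuidx" and "__loadx" dominate a
listing like the one above. It assumes every traced call starts at the beginning of a line in the form
"name(arguments"; the function name is made up for this illustration:

```shell
# Count how often each call occurs in a truss log, most frequent first.
# Lines that do not look like "name(args..." are skipped.
# "truss_call_counts" is a hypothetical helper name.
truss_call_counts() {
    sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)(.*/\1/p' "$1" | sort | uniq -c | sort -rn
}

# Usage: truss_call_counts truss.log
```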


4.3. Other possibly useful AIX commands:
========================================

Although not directly related to tracing, the following limited list of commands might help in creating a better view of
your system and processes. I am sure you are familiar with them, but let's list them anyway:


-- Show your AIX version:

# oslevel -r
# oslevel -s		# with SP, TL

-- Show the jobs that are scheduled (in the account you use) from cron:

# crontab -l

-- What are the standard mounted filesystems?: That's defined in "/etc/filesystems"

# cat /etc/filesystems | more

-- Which processes are using a certain filesystem?

# fuser -c /filesystem     # We mean the "mountpoint", like for example /appl/oracle

-- Show memory usage of a process:

# procmap pid              # pid is the process-id, as visible in the command "ps -ef"   

-- Show the open files that a process uses:

# pfiles pid               # also take a look at the "lsof" command: man lsof            

-- Show system memory:

# bootinfo -r
# lsattr -E -l mem0
# lsattr -E -l sys0 -a realmem
# svmon -G
# vmstat -v
# vmo -L                # ( lots of output )
# svmon -U -g -t 10     # ( top 10 users paging space)

-- Swap usage:

# lsps -s                 # more than 60%-70% used? It gets really scary. More than 75% used? Oh boy!
# pstat -s

-- cpu info:

# lparstat (-i)       
# prtconf | grep proc
# pmcycles -m
# lscfg | grep proc
# pstat -S

-- ulimit:

Sometimes, when a process runs under someone's credentials, and it fails for no obvious reason, it might be
worth checking the "ulimit" settings of that account (like max file size, max number of open files etc.).
Use it under that account as:

# ulimit -a

-- Show process tree of parent and children:

# proctree pid        # Tip: take a look at the "proc tools" on AIX               


-- Show the system error report / error log:

# errpt                           # or "errpt | more" 
# errpt -aj <ERRID> | more        # view details of an error record. ERRID is the 1st identifier in such a record.

-- Determine the type of a file (e.g. is it ascii, or another type of file?)

# file file_name          # (yes..., the command is really "file")

-- Show free/used space of the filesystems:

# df -m         # m in MB; k in KB; g in GB

If there are many filesystems, you might want to see just the 5 that are the fullest (highest %Used):

# df -k |awk '{print $4,$7}' |grep -v "Filesystem" | sort -n | tail -5
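
The same idea can be wrapped in a small function, so that the header line and entries without a percentage
(like /proc) are skipped and the fullest filesystem comes first. This is only a sketch, assuming the AIX
"df -k" column layout (Filesystem, 1024-blocks, Free, %Used, Iused, %Iused, Mounted on); the function
name is made up for this illustration:

```shell
# Read "df -k" output on stdin and print the N fullest filesystems
# as "%Used mountpoint", fullest first.
# Assumes %Used is column 4 and the mount point is column 7 (AIX layout).
# "fullest_filesystems" is a hypothetical helper name.
fullest_filesystems() {
    awk 'NR > 1 && $4 ~ /%$/ { print $4, $7 }' | sort -rn | head -n "$1"
}

# Usage: df -k | fullest_filesystems 5
```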


-- How to become another user, or possibly root:

# su - accountname       # (switch to that accountname like "su - albert")
# su -                   # (switch to root)
                         # if the sudo utility is implemented, you might try the command "sudo -l" to see what you might execute.

-- Careful!! How to kill a process "the hard way"?

# kill -9 PID              # careful, don't kill the wrong one; not recommended unless you don't have a choice.

-- Careful!! How to kill all your processes "the hard way", all at once?

# kill -9 -1               # be very careful; not recommended unless you don't have a choice.
# killall                  # be very careful; not recommended unless you don't have a choice.


-- Show your uid (userid) and gid (groupid):

# id

-- refresh inetd after modifying "/etc/inetd.conf":

# refresh -s inetd

-- Show the last logged on users + date (history as recorded in /var/adm/wtmp):

# last | more

-- To show the init runlevel:

# who -r 

-- Show uptime of system plus average load (over the last 1, 5 and 15 minutes):

# uptime

-- Clean memory with ipcrm (be careful):

# ipcrm -m 50855977      # (remove shared memory segment, identified by example id 50855977; be careful)
# ipcrm -s 2228248       # (remove semaphore, identified by example id 2228248; be careful)
# ipcrm -q 5111883       # (remove message queue, identified by example id 5111883; be careful)
                         # (see man pages ipcrm)

-- To clear out unused system modules (currently unused modules in kernel and library memory):

# slibclean


============================================================================
5. Solaris:
============================================================================

A similar "story" will be put here, but then of course for Solaris.

 
============================================================================
6. Other:
============================================================================


6.1 Some trivial remarks:
=========================

(1):
====

Now for some really really trivial remarks......   (Sorry !)

- kernel parameters

If you have problems installing a program, or if it fails to run properly, are you sure all
required kernel parameters have been set? 

- Environment variables

If you have problems installing a program, or if it fails to run properly, are you sure all
required environment variables have been set? 
Many "large" programs really have an impressive list of variables you need to set
before they will run properly.

- Dependencies on other stuff.

Most (commercial) programs depend heavily on installed support programs or tools, like perl, java, etc.
They may even have very strict requirements on versions of those support programs.

- Cluttered memory (ipc identifiers, semaphores, shared memory)

If you have started an application, and terminated it roughly, it's possible that
"stuff" still remains in memory. 
In such a case, it's possible that your app will not be able to restart.
You need to use a tool like "ipcrm" to clean memory, or
you might even consider rebooting the system.
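
Before you remove anything with "ipcrm", you first need to know which identifiers belong to the account
that ran the application. The "ipcs" command lists them. The sketch below filters that list on owner and
prints the type (m, s or q) and the id, which are exactly what "ipcrm -m", "ipcrm -s" and "ipcrm -q" need.
It assumes the AIX "ipcs" layout with the columns T, ID, KEY, MODE, OWNER, GROUP (check this on your
system, other UNIX flavors differ); the function name and the account "oracle" are just examples:

```shell
# Read "ipcs" output on stdin and print "type id" for every shared memory
# segment (m), semaphore (s) or message queue (q) owned by the given account.
# Assumes the owner is in column 5, as in the AIX "ipcs" layout.
# "ipc_ids_of" is a hypothetical helper name.
ipc_ids_of() {
    awk -v owner="$1" '($1 == "m" || $1 == "s" || $1 == "q") && $5 == owner { print $1, $2 }'
}

# Usage: ipcs | ipc_ids_of oracle
```

Only remove identifiers when you are sure the application that created them is really gone.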


(2):
====

For HP-UX 11i, a trace tool called "tusc" is available. You need to download it from HP and install it.
The way to use it is very similar to the tools we have seen above. There even exists a "truss" wrapper around it, so 
you can use it like truss as we have seen above.