
/****************************************************************************/

/* Document     : UNIX command examples, mainly based on Solaris, AIX, HP   */
/*                and of course, also Linux.                                */
/* Doc. Version : 102                                                       */
/* File         : unix.txt                                                  */
/* Purpose      : some useful examples for the Oracle, DB2, SQLServer DBA   */
/* Date         : 10-03-2008                                                */
/* Compiled by  : Albert van der Sel                                        */
/* Best use     : Use find/search in your editor to find a string, command, */
/*                or any identifier                                         */
/****************************************************************************/

#####################################
SECTION 1. COMMANDS AND ARCHITECTURE:
#####################################

==========================
1. HOW TO GET SYSTEM INFO:
==========================

1.1 Short version:
==================

See section 1.2 for more detailed commands and options.

Memory:
-------
AIX: bootinfo -r
lsattr -E -l mem0
/usr/sbin/lsattr -E -l sys0 -a realmem
or use a tool such as "topas" or "nmon" (these are utilities)
Linux: cat /proc/meminfo
/usr/sbin/dmesg | grep "Physical"
free (the free command)
HP: /usr/sam/lbin/getmem
grep MemTotal /proc/meminfo
/etc/dmesg | grep -i phys
wc -c /dev/mem
or use a tool such as "glance", e.g. by entering "glance -m" at the prompt (glance is a utility)
Solaris: /usr/sbin/prtconf | grep "Memory size"
Tru64: /bin/vmstat -P | grep "Total Physical Memory"

Swap:
-----

AIX: /usr/sbin/lsps -a
HP: /usr/sbin/swapinfo -a
Solaris: /usr/sbin/swap -l
Linux: /sbin/swapon -s
cat /proc/swaps
cat /proc/meminfo

OS version:
-----------

HP: uname -a
Linux: cat /proc/version
Solaris: uname -a
Tru64: /usr/sbin/sizer -v
AIX: oslevel -r
lslpp -h bos.rte

AIX firmware:
lsmcode -c            display the system firmware level and service processor
lsmcode -r -d scraid0 display the adapter microcode levels for a RAID adapter scraid0
lsmcode -A            display the microcode level for all supported devices
prtconf               shows many settings including memory, firmware, serial# etc..

cpu:
----

HP: ioscan -kfnC processor


getconf CPU_VERSION
getconf CPU_CHIP_TYPE
model

AIX: prtconf | grep proc


pmcycles -m
lsattr -El procx (x is 0,2, etc..)
lscfg | grep proc

Linux: cat /proc/cpuinfo

Solaris: psrinfo -v
prtconf

Notes about lpars:
------------------

For AIX: The uname -L command identifies a partition on a system with multiple
LPARS. The LPAR id
can be useful for writing shell scripts that customize system settings such as
IP address or hostname.

The output of the command looks like:

# uname -L
1 lpar01

The output of uname -L varies by maintenance level. For consistent output across
maintenance levels, add a -s flag. For example, the following commands assign the
partition number to the variable "lpar_number" and the partition name to "lpar_name".

For HP-UX:
Use commands like "parstatus" or "getconf PARTITION_IDENT" to get npar
information.

patches:
--------

AIX: Is a certain fix (APAR) installed?


instfix -ik APAR_number
instfix -a -ivk APAR_number

To determine your platform firmware level, at the command prompt, type:

lscfg -vp | grep -p Platform

The last six digits of the ROM level represent the platform firmware date
in the format, YYMMDD.

HP: /usr/sbin/swlist -l patch


swlist | grep patch
Linux: rpm -qa
Solaris: showrev -p
pkginfo -i package_name
Tru64: /usr/sbin/dupatch -track -type kit

Netcards:
---------

AIX: lsdev -Cc adapter


lsdev -Cc adapter | grep ent
lsdev -Cc if
lsattr -E -l ent1
ifconfig -a
Solaris: prtconf -D / prtconf -pv / prtconf | grep "card"
prtdiag | grep "card"
svcs -x
ifconfig -a (up plumb)

1.2 More Detail:
================

1.2.1 Show memory in Solaris:
=============================

prtconf:
--------
Use this command to obtain detailed system information about your Sun Solaris
installation
# /usr/sbin/prtconf
# prtconf -v
Displays the size of the system memory and reports information about peripheral
devices

Use this command to see the amount of memory:


# /usr/sbin/prtconf | grep "Mem"

sysdef -i reports on several system resource limits. Other parameters can be
checked on a running system using adb -k :

# adb -k /dev/ksyms /dev/mem
parameter-name/D
^D (to exit)
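
For example, to look at the maxusers parameter (just one of the standard Solaris
kernel parameters; substitute whatever parameter you are interested in):

# adb -k /dev/ksyms /dev/mem
maxusers/D
^D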

1.2.2 Show memory in AIX:
=========================

>> Show Total memory:
---------------------

# bootinfo -r
# lsattr -El sys0 -a realmem
# prtconf (you can grep it on memory)

>> Show Details of memory:
--------------------------

You can have a more detailed and comprehensive look at AIX memory by using "vmstat
-v" and "vmo -L" or "vmo -a":

For example:

# vmstat -v
524288 memory pages
493252 lruable pages
67384 free pages
7 memory pools
131820 pinned pages
80.0 maxpin percentage
20.0 minperm percentage
80.0 maxperm percentage
25.4 numperm percentage
125727 file pages
0.0 compressed percentage
0 compressed pages
25.4 numclient percentage
80.0 maxclient percentage
125575 client pages
0 remote pageouts scheduled
14557 pending disk I/Os blocked with no pbuf
6526890 paging space I/Os blocked with no psbuf
18631 filesystem I/Os blocked with no fsbuf
0 client filesystem I/Os blocked with no fsbuf
49038 external pager filesystem I/Os blocked with no fsbuf
0 Virtualized Partition Memory Page Faults
0.00 Time resolving virtualized partition memory page faults

The vmo command really gives lots of output. In the following example only a small
fraction of the output is shown:

# vmo -L

..
lrubucket 128K 128K 128K 64K 4KB pages D
--------------------------------------------------------------------------------
maxclient% 80 80 80 1 100 % memory D
maxperm%
minperm%
--------------------------------------------------------------------------------
maxfree 1088 1088 1088 8 200K 4KB pages D
minfree
memory_frames
--------------------------------------------------------------------------------
maxperm 394596 394596 S
--------------------------------------------------------------------------------
maxperm% 80 80 80 1 100 % memory D
minperm%
maxclient%
--------------------------------------------------------------------------------
maxpin 424179 424179 S
..
..

>> To further look at your virtual memory and its causes, you can use a combination of:
---------------------------------------------------------------------------------------

# ipcs -bm (shared memory)


# lsps -a (paging)
# vmo -a or vmo -L (virtual memory options)
# svmon -G (basic memory allocations)
# svmon -U (virtual memory usage by user)

To print out the memory usage statistics for the users root and steve
taking into account only working segments, type:

svmon -U root steve -w

To print out the top 10 users of the paging space, type:

svmon -U -g -t 10

To print out the memory usage statistics for the user steve, including the
list of the process identifiers, type:

svmon -U steve -l
svmon -U emcdm -l
Note: sysdumpdev -e
Although the sysdumpdev command is used to show or alter the dump device for a
system dump, you can also use it to show how much real memory is used.

The command
# sysdumpdev -e
provides an estimated dump size, taking into account the memory (not paging space)
currently in use by the system.

Note: the rmss command:

The rmss (Reduced-Memory System Simulator) command is used to ascertain the
effects of reducing the amount of available memory on a system, without the need
to physically remove memory from the system. It is useful for system sizing, as
you can install more memory than is required and then use rmss to reduce it.
Using other performance tools, the effects of the reduced memory can be monitored.
The rmss command has
the ability to run a command multiple times using different simulated memory sizes
and produce statistics
for all of those memory sizes.

The rmss command resides in /usr/bin and is part of the bos.perf.tools fileset,
which is installable
from the AIX base installation media.

Syntax: rmss -p | -c <MB> | -r

Options
-p     Print the current value
-c MB  Change to MB size (in Mbytes)
-r     Restore all memory to use

Example: find out how much memory you have online


rmss -p
Example: Change available memory to 256 Mbytes
rmss -c 256
Example: Undo the above
rmss -r

Warning:

rmss can damage performance very seriously.
Don't go below 25% of the machine's memory.
Never forget to finish with rmss -r.

1.2.3 Show memory in Linux:
===========================

# /usr/sbin/dmesg | grep "Physical:"
# cat /proc/meminfo

The ipcs, vmstat, iostat and that type of commands are of course more or less the
same in Linux as they are in Solaris or AIX.

1.2.4 Show aioservers in AIX:
=============================

# lsattr -El aio0


autoconfig available STATE to be configured at system restart True
fastpath enable State of fast path True
kprocprio 39 Server PRIORITY True
maxreqs 4096 Maximum number of REQUESTS True
maxservers 10 MAXIMUM number of servers per cpu True
minservers 1 MINIMUM number of servers True

# pstat -a | grep -c aios


20

# ps -k | grep aioserver
331962 - 0:15 aioserver
352478 - 0:14 aioserver
450644 - 0:12 aioserver
454908 - 0:10 aioserver
565292 - 0:11 aioserver
569378 - 0:10 aioserver
581660 - 0:11 aioserver
585758 - 0:17 aioserver
589856 - 0:12 aioserver
593954 - 0:15 aioserver
598052 - 0:17 aioserver
602150 - 0:12 aioserver
606248 - 0:13 aioserver
827642 - 0:14 aioserver
991288 - 0:14 aioserver
995388 - 0:11 aioserver
1007616 - 0:12 aioserver
1011766 - 0:13 aioserver
1028096 - 0:13 aioserver
1032212 - 0:13 aioserver

What are aioservers in AIX 5?

With IO on filesystems, for example when a database is involved, you may need to
tune the number of aioservers (asynchronous IO servers).

AIX 5L supports asynchronous I/O (AIO) for database files created both on file
system partitions and on raw devices.
AIO on raw devices is implemented fully into the AIX kernel, and does not require
database processes
to service the AIO requests. When using AIO on file systems, the kernel database
processes (aioserver)
control each request from the time a request is taken off the queue until it
completes. The kernel database
processes are also used with I/O with virtual shared disks (VSDs) and HSDs with
FastPath disabled. By default,
FastPath is enabled. The number of aioserver servers determines the number of AIO
requests that can be executed
in the system concurrently, so it is important to tune the number of aioserver
processes when using file systems
to store Oracle Database data files.

- Use one of the following commands to set the number of servers. This applies
only when using asynchronous I/O
on file systems rather than raw devices:

# smit aio

# chdev -P -l aio0 -a maxservers='128' -a minservers='20'

- To set asynchronous IO to "Available":

# chdev -l aio0 -P -a autoconfig=available

You need to restart the Server:


# shutdown -Fr

1.2.5 aio on Linux distro's:
============================

On some Linux distro's, Oracle 9i/10g supports asynchronous I/O but it is disabled
by default because
some Linux distributions do not have libaio by default. For Solaris, the following
configuration is not required
- skip down to the section on enabling asynchronous I/O.

On Linux, the Oracle binary needs to be relinked to enable asynchronous I/O. The
first thing to do is shutdown
the Oracle server. After Oracle has shutdown, do the following steps to relink the
binary:

su - oracle
cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk async_on
make -f ins_rdbms.mk ioracle

1.2.6 The ipcs and ipcrm commands:
==================================

The "ipcs" command is really a "listing" command. But if you need to intervene
in memory structures, like for example if you need to "clear" or remove a shared
memory segment,
because a faulty or crashed
application left semaphores, memory identifiers, or queues in place,
you can use to "ipcrm" command to remove those structures.

Example ipcrm command usage:
----------------------------

Suppose an application crashed, but it cannot be started again. The following
might help, if you happen to know which IPC identifier it used.
Suppose the app used 47500 as the IPC key. Convert this decimal number to hex,
which is, in this example, B98C.
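
For instance, one quick way to do that decimal-to-hex conversion from the shell:

# printf "%X\n" 47500
B98C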

Now do the following:

# ipcs -bm | grep B98C

This might give you, for example, the shared memory identifier "50855977".
Now clear the segment:

# ipcrm -m 50855977

It might also be that a semaphore and/or queue is still "left over".
In that case you might also try commands like the following example:

ipcs -q
ipcs -s

# ipcrm -s 2228248 (remove semaphore)


# ipcrm -q 5111883 (remove queue)

Note: in some cases the "slibclean" command can be used to clear unused modules in
kernel and library memory.
Just give as root the command:

# slibclean

Other Example:
--------------

If you run the following command to remove a shared memory segment and you get
this error:

# ipcrm -m 65537
ipcrm: 0515-020 shmid(65537) was not found.

However, if you run the ipcs command, you still see the segment there:

# ipcs | grep 65537


m 65537 0x00000000 DCrw------- root system

If you look carefully, you will notice the "D" in the fourth column. The "D" means:

D  The associated shared memory segment has been removed. It disappears when the
   last process attached to the segment detaches from it.

So, to clear the shared memory segment, find the process which is still associated
with the segment:

# ps -ef | grep process_owner

where process_owner is the name of the owner using the shared segment

Now kill the process found from the ps command above


# kill -9 pid

Running another ipcs command will show the shared memory segment no longer exists:

# ipcs | grep 65537



1.2.7 Show patches, version, systeminfo:
========================================

Solaris:
========

showrev:
--------

#showrev
Displays system summary information.

#showrev -p
Reports which patches are installed

sysdef and dmesg:
-----------------

The following commands also display configuration information:

# sysdef
# dmesg

versions:
---------

==> To check your Solaris version:


# uname -a or uname -m
# cat /etc/release
# isainfo -v

==> To check your AIX version:

# oslevel
# oslevel -r tells you which maintenance level you have.

>> To find the known recommended maintenance levels:


# oslevel -rq

>> To find all filesets lower than a certain maintenance level:


# oslevel -rl 5200-06

>> To find all filesets higher than a certain maintenance level:


# oslevel -rg 5200-05

>> To list all known recommended maintenance and technology levels on the system,
type:

# oslevel -q -s
Known Service Packs
-------------------
5300-05-04
5300-05-03
5300-05-02
5300-05-01
5300-05-00
5300-04-CSP
5300-04-03
5300-04-02
5300-04-01
5300-03-CSP

>> How can I determine which fileset updates are missing from a particular AIX
level?
To determine which fileset updates are missing from 5300-04, for example, run the
following command:

# oslevel -rl 5300-04

>> What SP (Service Pack) is installed on my system?


To see which SP is currently installed on the system, run the oslevel -s command.
Sample output for an
AIX 5L Version 5.3 system, with TL4, and SP2 installed would be:

# oslevel -s
5300-04-02

>> Is a CSP (Concluding Service Pack) installed on my system?


To see if a CSP is currently installed on the system, run the oslevel -s command.
Sample output for an AIX 5L Version 5.3 system, with TL3, and CSP installed would
be:

# oslevel -s
5300-03-CSP

==> To check your HP machine:

# model
9000/800/rp7410

Machine info on AIX:
--------------------

How do I find out the Chip type, System name, Node name, Model Number etc.?

The uname command provides details about your system. uname -p Displays the chip
type of the system.
For example, powerpc.

uname -r Displays the release number of the operating system.


uname -s Displays the system name. For example, AIX.
uname -n Displays the name of the node.
uname -a Displays the system name, nodename,Version, Machine id.
uname -M Displays the system model name. For example, IBM, 7046-B50.
uname -v Displays the operating system version
uname -m Displays the machine ID number of the hardware running the system.
uname -u Displays the system ID number.

Architecture:
-------------

To see if you have a CHRP machine, log into the machine as the root user, and run
the following command:

# lscfg | grep Architecture

or use:

# lscfg -pl sysplanar0 | more

The bootinfo -p command also shows the architecture of the pSeries, RS/6000

# bootinfo -p
chrp

1.2.8 Check whether you have a 32 bit or 64 bit version:
========================================================

- Solaris:

# isainfo -vk

If /usr/bin/isainfo cannot be found, then the OS only supports 32-bit process
address spaces. (Solaris 7 was the first version that could run 64-bit binaries
on certain SPARC-based systems.)
So a ksh-based test might look something like:

if [ -x /usr/bin/isainfo ]; then
   bits=`/usr/bin/isainfo -b`
else
   bits=32
fi
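
Continuing that sketch, you could then branch on the result, for example:

if [ "$bits" -eq 64 ]; then
   echo "This system can run 64-bit binaries"
else
   echo "This system is 32-bit only"
fi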

- AIX:

Command: /bin/lslpp -l bos.64bit   ...to see if bos.64bit is installed & committed.
-or-     /bin/locale64             ...error message if on a 32bit machine, such as:
                                      Could not load program /bin/locale64:
                                      Cannot run a 64-bit program on a 32-bit machine.

Or use:

# bootinfo -K   displays the current kernel wordsize of "32" or "64"
# bootinfo -y   tells if the hardware is 64-bit capable
# bootinfo -p   if it returns the string 32, the machine is only capable of running
                the 32-bit kernel. If it returns the string chrp, the machine is
                capable of running the 64-bit kernel or the 32-bit kernel.
Or use:

# /usr/bin/getconf HARDWARE_BITMODE

This command should return the following output:

64

Note:
-----

HOW TO CHANGE KERNEL MODE OF IBM AIX 5L (5.1)
---------------------------------------------

The AIX 5L has pre-configured kernels. These are listed below for Power
processors:

/usr/lib/boot/unix_up   32 bit uni-processor kernel
/usr/lib/boot/unix_mp   32 bit multi-processor kernel
/usr/lib/boot/unix_64   64 bit multi-processor kernel

Switching between kernel modes means using different kernels. This is simply
done by pointing the location that is referenced by the system to these kernels.
Use symbolic links for this purpose. During boot AIX system runs the kernel
in the following locations:

/unix
/usr/lib/boot/unix

The base operating system 64-bit runtime fileset is bos.64bit. Installing bos.64bit
also installs the /etc/methods/cfg64 file. The /etc/methods/cfg64 file provides the
option of enabling or disabling the 64-bit environment via SMIT, which updates the
/etc/inittab file with the load64bit line.
(Simply adding the load64bit line does not enable the 64-bit environment).

The command lslpp -l bos.64bit reveals if this fileset is installed. The bos.64bit
fileset is on the AIX media; however, installing the bos.64bit fileset does not
ensure that you will be able to run 64-bit software. If the bos.64bit fileset is
installed on 32-bit hardware, you should be able to compile 64-bit software, but
you cannot run 64-bit programs on 32-bit hardware.

The syscalls64 extension must be loaded in order to run a 64-bit executable.
This is done from the load64bit entry in the inittab file. You must load the
syscalls64 extension even when running a 64-bit kernel on 64-bit hardware.

To determine if the 64-bit kernel extension is loaded, at the command line,
enter genkex | grep 64. Information similar to the following displays:

149bf58 a3ec /usr/lib/drivers/syscalls64.ext

To change the kernel mode follow steps below:

1. Create symbolic links from /unix and /usr/lib/boot/unix to the location
   of the desired kernel.
2. Create the boot image.
3. Reboot AIX.

Below lists the detailed actions to change kernel mode:

To change to 32 bit uni-processor mode:

# ln -sf /usr/lib/boot/unix_up /unix


# ln -sf /usr/lib/boot/unix_up /usr/lib/boot/unix
# bosboot -ad /dev/ipldevice
# shutdown -r

To change to 32 bit multi-processor mode:

# ln -sf /usr/lib/boot/unix_mp /unix


# ln -sf /usr/lib/boot/unix_mp /usr/lib/boot/unix
# bosboot -ad /dev/ipldevice
# shutdown -r

To change to 64 bit multi-processor mode:

# ln -sf /usr/lib/boot/unix_64 /unix


# ln -sf /usr/lib/boot/unix_64 /usr/lib/boot/unix
# bosboot -ad /dev/ipldevice
# shutdown -r

IMPORTANT NOTE: If you are changing the kernel mode to 32-bit and you will run
9.2 on this server, the following line should be included in /etc/inittab:

load64bit:2:wait:/etc/methods/cfg64 >/dev/console 2>&1 # Enable 64-bit execs

This allows 64-bit applications to run on the 32-bit kernel. Note that this
line is also mandatory if you are using the 64-bit kernel.

In AIX 5.2, the 32-bit kernel is installed by default. The 64-bit kernel, along
with JFS2
(enhanced journaled file system), can be enabled at installation time.

Checking if other unixes are in 32 or 64 mode:
----------------------------------------------

- Digital UNIX/Tru64: This OS is only available in 64bit form.

- HP-UX (available in 64bit starting with HP-UX 11.0):

  Command: /bin/getconf KERNEL_BITS   ...returns either 32 or 64

- SGI: This OS is only available in 64bit form.


- The remaining supported UNIX platforms are only available in 32bit form.

scinstall:
----------

# scinstall -pv
Displays Sun Cluster software release and package version information

1.2.9 Info about CPUs:
======================

Solaris:
--------

# psrinfo -v
Shows the number of processors and their status.

# psrinfo -v | grep "Status of processor" | wc -l
Shows the number of cpu's

Linux:
------

# cat /proc/cpuinfo
# cat /proc/cpuinfo | grep processor | wc -l

Especially with Linux, the /proc directory contains special "files" that either
extract information from
or send information to the kernel

HP-UX:
------

# ioscan -kfnC processor


# /usr/sbin/ioscan -kf | grep processor
# grep processor /var/adm/syslog/syslog.log
# /usr/contrib/bin/machinfo (Itanium)

Several other ways:

1. sam -> performance monitor -> processor
2. print_manifest (if ignite-ux is installed)
3. machinfo (11.23 HP versions)
4. ioscan -fnC processor
5. echo "processor_count/D" | adb /stand/vmunix /dev/kmem
6. top command to get the cpu count

The "getconf" command can give you a lot of interesting info. The parameters are:

ARG_MAX _BC_BASE_MAX BC_DIM_MAX


BS_SCALE_MAX BC_STRING_MAX CHARCLASS_NAME_MAX
CHAR_BIT CHAR_MAX CHAR_MIN
CHILD_MAX CLK_TCK COLL_WEIGHTS_MAX
CPU_CHIP_TYPE CS_MACHINE_IDENT CS_PARTITION_IDENT
CS_PATH CS_MACHINE_SERIAL EXPR_NEST_MAX
HW_CPU_SUPP_BITS HW_32_64_CAPABLE INT_MAX
INT_MIN KERNEL_BITS LINE_MAX
LONG_BIT LONG_MAX LONG_MIN
MACHINE_IDENT MACHINE_MODEL MACHINE_SERIAL
MB_LEN_MAX NGROUPS_MAX NL_ARGMAX
NL_LANGMAX NL_MSGMAX NL_NMAX
NL_SETMAX NL_TEXTMAX NZERO
OPEN_MAX PARTITION_IDENT PATH
_POSIX_ARG_MAX _POSIX_JOB_CONTROL _POSIX_NGROUPS_MAX
_POSIX_OPEN_MAX _POSIX_SAVED_IDS _POSIX_SSIZE_MAX
_POSIX_STREAM_MAX _POSIX_TZNAME_MAX _POSIX_VERSION
POSIX_ARG_MAX POSIX_CHILD_MAX POSIX_JOB_CONTROL
POSIX_LINK_MAX POSIX_MAX_CANON POSIX_MAX_INPUT
POSIX_NAME_MAX POSIX_NGROUPS_MAX POSIX_OPEN_MAX
POSIX_PATH_MAX POSIX_PIPE_BUF POSIX_SAVED_IDS
POSIX_SSIZE_MAX POSIX_STREAM_MAX POSIX_TZNAME_MAX
POSIX_VERSION POSIX2_BC_BASE_MAX POSIX2_BC_DIM_MAX
POSIX2_BC_SCALE_MAX POSIX2_BC_STRING_MAX POSIX2_C_BIND
POSIX2_C_DEV POSIX2_C_VERSION POSIX2_CHAR_TERM
POSIX_CHILD_MAX POSIX2_COLL_WEIGHTS_MAX POSIX2_EXPR_NEST_MAX
POSIX2_FORT_DEV POSIX2_FORT_RUN POSIX2_LINE_MAX
POSIX2_LOCALEDEF POSIX2_RE_DUP_MAX POSIX2_SW_DEV
POSIX2_UPE POSIX2_VERSION SC_PASS_MAX
SC_XOPEN_VERSION SCHAR_MAX SCHAR_MIN
SHRT_MAX SHRT_MIN SSIZE_MAX

Example:

# getconf CPU_VERSION

Sample function in a shell script:

get_cpu_version()
{
  case `getconf CPU_VERSION` in
    # ???) echo "Itanium[TM] 2" ;;
    768) echo "Itanium[TM] 1" ;;
    532) echo "PA-RISC 2.0" ;;
    529) echo "PA-RISC 1.2" ;;
    528) echo "PA-RISC 1.1" ;;
    523) echo "PA-RISC 1.0" ;;
    *)   return 1 ;;
  esac
  return 0
}
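
Example usage of this function (just an illustration):

if cpu=`get_cpu_version`; then
   echo "This machine runs a $cpu cpu"
else
   echo "Unknown CPU_VERSION"
fi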

AIX:
----

# pmcycles -m
Cpu 0 runs at 1656 MHz
Cpu 1 runs at 1656 MHz
Cpu 2 runs at 1656 MHz
Cpu 3 runs at 1656 MHz
# lscfg | grep proc

More cpu information on AIX:

# lsattr -El procx (where x is the number of the cpu)


type powerPC_POWER5 Processor type False
frequency 165600000 Processor speed False
..
..
where False means that the value cannot be changed through an AIX command.

To view CPU scheduler tunable parameters, use the schedo command:

# schedo -a

In AIX 5L on Power5, you can switch between Simultaneous Multithreading (SMT) and
Single Threading (ST) with the smtctl command, as follows:

# smtctl -m off      will set SMT mode to disabled
# smtctl -m on       will set SMT mode to enabled
# smtctl -W boot     makes SMT effective on next boot
# smtctl -W now      effects SMT now, but will not persist across reboots

When you want to keep the setting across reboots, you must use the bosboot command
in order to create a new boot image.
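
For example, to make the new SMT setting part of a fresh boot image, you could use
bosboot in the same way as shown in the kernel mode section earlier:

# bosboot -ad /dev/ipldevice
# shutdown -Fr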

1.2.10 Other stuff:
===================

runlevel:
---------
To show the init runlevel:
# who -r

Top users:
----------

To get a quick impression about the top 10 users in the system at this time:

ps auxw | sort -r +3 | head -10     Shows top 10 memory usage by process
ps auxw | sort -r +2 | head -10     Shows top 10 CPU usage by process

shared memory:
--------------
To check shared memory segment, semaphore array, and message queue limits, issue
the ipcs -l command.
# ipcs -l

The following tools are available for monitoring the performance of your UNIX-
based system.

pfiles:
-------
/usr/proc/bin/pfiles
This shows the open files for this process, which helps you diagnose whether you
are having problems
caused by files not getting closed.
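
Example (using a hypothetical process id):

# /usr/proc/bin/pfiles 2795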

lsof:
-----

This utility lists open files for running UNIX processes, like pfiles. However,
lsof gives more
useful information than pfiles. You can find lsof at
ftp://vic.cc.purdue.edu/pub/tools/unix/lsof/.

Example of lsof usage:

You can see CIO (concurrent IO) in the FILE-FLAG column if you run lsof +fg, e.g.:

tarunx01:/home/abielewi:# /p570build/LSOF/lsof-4.76/usr/local/bin/lsof +fg /baanprd/oradat

COMMAND    PID     USER   FD  TYPE  FILE-FLAG           DEVICE  SIZE/OFF  NODE  NAME
oracle 434222 oracle 16u VREG R,W,CIO,DSYN,LG;CX 39,1
6701056 866 /baanprd/oradat (/dev/bprdoradat)
oracle 434222 oracle 17u VREG R,W,CIO,DSYN,LG;CX 39,1
6701056 867 /baanprd/oradat (/dev/bprdoradat)
oracle 442384 oracle 15u VREG R,W,CIO,DSYN,LG;CX 39,1
1174413312 875 /baanprd/oradat (/dev/bprdoradat)
oracle 442384 oracle 16u VREG R,W,CIO,DSYN,LG;CX 39,1
734011392 877 /baanprd/oradat (/dev/bprdoradat)
oracle 450814 oracle 15u VREG R,W,CIO,DSYN,LG;CX 39,1
1174413312 875 /baanprd/oradat (/dev/bprdoradat)
oracle 450814 oracle 16u VREG R,W,CIO,DSYN,LG;CX 39,1
1814044672 876 /baanprd/oradat (/dev/bprdoradat)
oracle 487666 oracle 15u VREG R,W,CIO,DSYN,LG;CX 39,1
1174413312 875 /baanprd/oradat (/dev/bprdoradat

You should also see O_CIO in your file open calls if you run truss,
e.g.:

open("/opt/oracle/rcat/oradat/redo01.log",
O_RDWR|O_CIO|O_DSYNC|O_LARGEFILE) = 18

VMSTAT SOLARIS:
---------------
# vmstat
This command is ideal for monitoring paging rate, which can be found under the
page in (pi) and page out (po) columns.
Other important columns are the amount of allocated virtual storage (avm) and free
virtual storage (fre).
This command is useful for determining if something is suspended or just taking a
long time.

Example:

kthr memory page disk faults cpu


r b w swap free re mf pi po fr de sr m0 m1 m3 m4 in sy cs us sy id
0 0 0 2163152 1716720 157 141 1179 1 1 0 0 0 0 0 0 680 1737 855 10 3 87
0 0 0 2119080 1729352 0 1 0 0 0 0 0 0 0 1 0 345 658 346 1 1 98
0 0 0 2118960 1729232 0 167 0 0 0 0 0 0 0 0 0 402 1710 812 4 2 94
0 0 0 2112992 1723264 0 1261 0 0 0 0 0 0 0 0 0 1026 5253 1848 10 5 85
0 0 0 2112088 1722352 0 248 0 0 0 0 0 0 0 0 0 505 2822 1177 5 2 92
0 0 0 2116288 1726544 4 80 0 0 0 0 0 0 0 0 0 817 4015 1530 6 4 90
0 0 0 2117744 1727960 4 2 30 0 0 0 0 0 0 0 0 473 1421 640 2 2 97

procs/r: Run queue length.


procs/b: Processes blocked while waiting for I/O.
procs/w: Idle processes which have been swapped.
memory/swap: Free, unreserved swap space (Kb).
memory/free: Free memory (Kb). (Note that this will grow until it reaches
lotsfree, at which point
the page scanner is started. See "Paging" for more details.)
page/re: Pages reclaimed from the free list. (If a page on the free list still
contains data needed
for a new request, it can be remapped.)
page/mf: Minor faults (page in memory, but not mapped). (If the page is still in
memory, a minor fault
remaps the page. It is comparable to the vflts value reported by sar -p.)

page/pi: Paged in from swap (Kb/s). (When a page is brought back from the swap
device, the process
will stop execution and wait. This may affect performance.)
page/po: Paged out to swap (Kb/s). (The page has been written and freed. This can
be the result of
activity by the pageout scanner, a file close, or fsflush.)
page/fr: Freed or destroyed (Kb/s). (This column reports the activity of the page
scanner.)
page/de: Freed after writes (Kb/s). (These pages have been freed due to a
pageout.)
page/sr: Scan rate (pages). Note that this number is not reported as a "rate," but
as a total number of pages scanned.
disk/s#: Disk activity for disk # (I/O's per second).
faults/in: Interrupts (per second).
faults/sy: System calls (per second).
faults/cs: Context switches (per second).
cpu/us: User CPU time (%).
cpu/sy: Kernel CPU time (%).
cpu/id: Idle + I/O wait CPU time (%).

When analyzing vmstat output, there are several metrics to which you should pay
attention. For example, keep an eye on the CPU run queue column. The run queue
should never exceed the number of CPUs on the server. If you do notice the run
queue exceeding the number of CPUs, it's a good indication that your server has
a CPU bottleneck.
To get an idea of the RAM usage on your server, watch the page in (pi) and page
out (po) columns of vmstat's output. By tracking common virtual memory operations
such as page outs, you can infer the times that the Oracle database is performing
a lot of work. Even though UNIX page ins must correlate with vmstat's refresh rate
to accurately predict RAM swapping, plotting page ins can tell you when the server
is having spikes of RAM usage.

Once captured, it's very easy to take the information about server performance
directly from the
Oracle tables and plot them in a trend graph. Rather than using an expensive
statistical package
such as SAS, you can use Microsoft Excel. Copy and paste the data from the tables
into Excel.
After that, you can use the Chart Wizard to create a line chart that will help you
view server
usage information and discover trends.

# VMSTAT AIX:
-------------

This is virtually equal to the usage of vmstat under solaris.

vmstat can be used to give multiple statistics on the system. For CPU-specific
work, try the following command:

# vmstat -t 1 3

This will take 3 samples, 1 second apart, with timestamps (-t). You can, of
course, change the parameters
as you like. The output is shown below.

kthr memory page faults cpu time


----- ----------- ------------------------ ------------ ----------- --------
r b avm fre re pi po fr sr cy in sy cs us sy id wa hr mi se
0 0 45483 221 0 0 0 0 1 0 224 326 362 24 7 69 0 15:10:22
0 0 45483 220 0 0 0 0 0 0 159 83 53 1 1 98 0 15:10:23
2 0 45483 220 0 0 0 0 0 0 145 115 46 0 9 90 1 15:10:24

In this output some of the things to watch for are:

"avm", which is Active Virtual Memory.


Ideally, under normal conditions, the largest avm value should in general be
smaller than the amount of RAM.
If avm is smaller than RAM, and still exessive paging occurs, that could be due to
RAM being filled
with file pages.

avm x 4K = number of bytes
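
For example, with the first sample line above: avm = 45483 pages, and
45483 x 4 KB = 181,932 KB, which is roughly 178 MB of active virtual memory.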

Columns r (run queue) and b (blocked) start going up, especially above 10. This
usually is an indication
that you have too many processes competing for CPU.

If cs (context switches) goes very high compared to the number of processes, then
you may need to tune the system with vmtune.

In the cpu section, us (user time) indicates the time being spent in programs.
(Assuming Java is at the top of the list in tprof, then you need to tune the Java
application.)
In the cpu section, if sys (system time) is higher than expected, and you still
have id (idle) time left,
this may indicate lock contention. Check the tprof for lock related calls in the
kernel time. You may want
to try multiple instances of the JVM. It may also be possible to find deadlocks in
a javacore file.

In the cpu section, if wa (I/O wait) is high, this may indicate a disk bottleneck,
and you should use
iostat and other tools to look at the disk usage.

Non-zero values in the pi, po (page in/out) columns may indicate that you are
paging and need more memory.
It may be possible that you have the stack size set too high for some of your JVM
instances.
It could also mean that you have allocated a heap larger than the amount of memory
on the system. Of course,
you may also have other applications using memory, or that file pages may be
taking up too much of the memory

Other example:
--------------

# vmstat 1

System configuration: lcpu=2 mem=3920MB

kthr memory page faults cpu


----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
0 0 229367 332745 0 0 0 0 0 0 3 198 69 0 0 99 0
0 0 229367 332745 0 0 0 0 0 0 3 33 66 0 0 99 0
0 0 229367 332745 0 0 0 0 0 0 2 33 68 0 0 99 0
0 0 229367 332745 0 0 0 0 0 0 80 306 100 0 1 97 1
0 0 229367 332745 0 0 0 0 0 0 1 20 68 0 0 99 0
0 0 229367 332745 0 0 0 0 0 0 2 36 64 0 0 99 0
0 0 229367 332745 0 0 0 0 0 0 2 33 66 0 0 99 0
0 0 229367 332745 0 0 0 0 0 0 2 21 66 0 0 99 0
0 0 229367 332745 0 0 0 0 0 0 1 237 64 0 0 99 0
0 0 229367 332745 0 0 0 0 0 0 2 19 66 0 0 99 0
0 0 229367 332745 0 0 0 0 0 0 6 37 76 0 0 99 0

The most important fields to look at here are:

r -- The average number of runnable kernel threads over whatever sampling interval
you have chosen.
b -- The average number of kernel threads that are in the virtual memory waiting
queue over your sampling interval. r should always be higher than b; if it is not,
it usually means you have a CPU bottleneck.
fre -- The size of your memory free list. Do not worry so much if the amount is
really small. More importantly, determine if there is any paging going on if this
amount is small.
pi -- Pages paged in from paging space.
po -- Pages paged out to paging space.
CPU section:
us
sy
id
wa

Let's look at the last section, which also comes up in most other CPU monitoring
tools, albeit with different headings:

us -- user time
sy -- system time
id -- idle time
wa -- waiting on I/O

# IOSTAT:
---------
This command is useful for monitoring I/O activities. You can use the read and
write rate to estimate the
amount of time required for certain SQL operations (if they are the only activity
on the system).
This command is also useful for determining if something is suspended or just
taking a long time.

Basic syntax is: iostat <options> interval count

option   - lets you specify the device for which information is needed, like disk,
           cpu or terminal (-d, -c, -t or -tdc). The -x option gives extended
           statistics.

interval - is the time period in seconds between two samples. iostat 4 will give
           data at 4 second intervals.

count    - is the number of times the data is needed. iostat 4 5 will give data at
           4 second intervals, 5 times.

Example:

$ iostat -xtc 5 2
extended disk statistics tty cpu
disk r/s w/s Kr/s Kw/s wait actv svc_t %w %b tin tout us sy wt id
sd0 2.6 3.0 20.7 22.7 0.1 0.2 59.2 6 19 0 84 3 85 11 0
sd1 4.2 1.0 33.5 8.0 0.0 0.2 47.2 2 23
sd2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
sd3 10.2 1.6 51.4 12.8 0.1 0.3 31.2 3 31

disk name of the disk


r/s reads per second
w/s writes per second
Kr/s kilobytes read per second
Kw/s kilobytes written per second
wait average number of transactions waiting for service (Q length)
actv average number of transactions actively
being serviced (removed from the queue but not yet completed)
%w percent of time there are transactions waiting for service (queue non-
empty)
%b percent of time the disk is busy (transactions in progress)
The values to look at in the iostat output are:

Reads/writes per second (r/s, w/s)
Percentage busy (%b)
Service time (svc_t)

If a disk shows consistently high reads/writes, the percentage busy (%b) of the
disk is greater than 5 percent, and the average service time (svc_t) is greater
than 30 milliseconds, then action needs to be taken.

# netstat
This command lets you know the network traffic on each node, and the number of
error packets encountered.
It is useful for isolating network problems.

Example:

To find out all listening services, you can use the command

# netstat -a -f inet
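
To look at the per-interface packet and error counters mentioned above, a common
choice is:

# netstat -i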

1.2.11 Some other utilities for Solaris:
========================================

# top
For example:

load averages: 0.66, 0.54, 0.56 11:14:48


187 processes: 185 sleeping, 2 on cpu
CPU states: % idle, % user, % kernel, % iowait, % swap
Memory: 4096M real, 1984M free, 1902M swap in use, 2038M swap free

PID USERNAME THR PRI NICE SIZE RES STATE TIME CPU COMMAND
2795 oraclown 1 59 0 265M 226M sleep 0:13 4.38% oracle
2294 root 11 59 0 8616K 7672K sleep 10:54 3.94% bpbkar
13907 oraclown 11 59 0 271M 218M cpu2 4:02 2.23% oracle
14138 oraclown 12 59 0 270M 230M sleep 9:03 1.76% oracle
2797 oraclown 1 59 0 189M 151M sleep 0:01 0.96% oracle
2787 oraclown 11 59 0 191M 153M sleep 0:06 0.69% oracle
2799 oraclown 1 59 0 190M 151M sleep 0:02 0.45% oracle
2743 oraclown 11 59 0 191M 155M sleep 0:25 0.35% oracle
2011 oraclown 11 59 0 191M 149M sleep 2:50 0.27% oracle
2007 oraclown 11 59 0 191M 149M sleep 2:22 0.26% oracle
2009 oraclown 11 59 0 191M 149M sleep 1:54 0.20% oracle
2804 oraclown 1 51 0 1760K 1296K cpu2 0:00 0.19% top
2013 oraclown 11 59 0 191M 148M sleep 0:36 0.14% oracle
2035 oraclown 11 59 0 191M 149M sleep 2:44 0.13% oracle
114 root 10 59 0 5016K 4176K sleep 23:34 0.05% picld

Process ID
This column shows the process ID (pid) of each process. The process ID is a
positive number,
usually less than 65536. It is used for identification during the life of the
process.
Once a process has exited or been killed, the process ID can be reused.

Username
This column shows the name of the user who owns the process. The kernel stores
this information
as a uid, and top uses an appropriate table (/etc/passwd, NIS, or NIS+) to
translate this uid in to a name.

Threads
This column displays the number of threads for the current process. This column is
present only
in the Solaris 2 port of top.
For Solaris, this number is actually the number of lightweight processes (lwps)
created by the
threads package to handle the threads. Depending on current resource utilization,
there may not
be one lwp for every thread. Thus this number is actually less than or equal to
the total number
of threads created by the process.

Nice
This column reflects the "nice" setting of each process. A process's nice is
inhereted from its parent.
Most user processes run at a nice of 0, indicating normal priority. Users have the
option of starting
a process with a positive nice value to allow the system to reduce the priority
given to that process.
This is normally done for long-running cpu-bound jobs to keep them from
interfering with
interactive processes. The Unix command "nice" controls setting this value. Only
root can set
a nice value lower than the current value. Nice values can be negative. On most
systems they range from -20 to 20.
The nice value influences the priority value calculated by the Unix scheduler.

Size
This column shows the total amount of memory allocated by each process. This is
virtual memory
and is the sum total of the process's text area (program space), data area, and
dynamically
allocated area (or "break"). When a process allocates additional memory with the
system call "brk",
this value will increase. This is done indirectly by the C library function
"malloc".
The number in this column does not reflect the amount of physical memory currently
in use by the process.

Resident Memory
This column reflects the amount of physical memory currently allocated to each
process.
This is also known as the "resident set size" or RSS. A process can have a large
amount
of virtual memory allocated (as indicated by the SIZE column) but still be using
very little physical memory.

Process State
This column reflects the last observed state of each process. State names vary
from system to system. These states are analogous to those that appear in the
process states line: the second line of the display.
The more common state names are listed below.
cpu - Assigned to a CPU and currently running
run - Currently able to run
sleep - Awaiting an external event, such as input from a device
stop - Stopped by a signal, as with control Z
swap - Virtual address space swapped out to disk
zomb - Exited, but parent has not called "wait" to receive the exit status

CPU Time
This column displays the accumulated CPU time for each process. This is the
amount of time
that any cpu in the system has spent actually running this process. The standard
format shows
two digits indicating minutes, a colon, then two digits indicating seconds.
For example, the display "15:32" indicates fifteen minutes and thirty-two seconds.

When a time value is greater than or equal to 1000 minutes, it is displayed as
hours with the suffix H.
For example, the display "127.4H" indicates 127 hours plus four tenths of an hour
(24 minutes).
When the number of hours exceeds 999.9, the "H" suffix is dropped so that the
display
continues to fit in the column.

CPU Percentage
This column shows the percentage of the cpu that each process is currently
consuming.
By default, top will sort this column of the output.
Some versions of Unix will track cpu percentages in the kernel, as the figure is
used in the calculation
of a process's priority. On those versions, top will use the figure as calculated
by the kernel.
Other versions of Unix do not perform this calculation, and top must determine the
percentage explicitly by monitoring the changes in cpu time.
On most multiprocessor machines, the number displayed in this column is a
percentage of the total
available cpu capacity. Therefore, a single threaded process running on a four
processor system will never
use more than 25% of the available cpu cycles.

Command
This column displays the name of the executable image that each process is
running.
In most cases this is the base name of the file that was invoked with the most
recent kernel "exec" call.
On most systems, this name is maintained separately from the zeroth argument. A
program that changes
its zeroth argument will not affect the output of this column.

# modinfo
The modinfo command provides information about the modules currently loaded by the
kernel.

The /etc/system file:


Available for Solaris Operating Environment, the /etc/system file contains
definitions for kernel configuration limits
such as the maximum number of users allowed on the system at a time, the maximum
number of processes per user,
and the inter-process communication (IPC) limits on size and number of resources.
These limits are important because
they affect DB2 performance on a Solaris Operating Environment machine. See the
Quick Beginnings information
for further details.
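
As an illustration only, the IPC related entries in /etc/system typically look like
the lines below (classic pre-Solaris 10 tunable names; the values are made-up
examples, so take the real ones from the DB2 or Oracle installation guides).
A reboot is needed before changes to /etc/system take effect.

set shmsys:shminfo_shmmax=4294967295
set semsys:seminfo_semmni=100
set semsys:seminfo_semmsl=256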

# more /etc/path_to_inst
To see the mapping between the kernel abbreviated instance name for physical
device names,
view the /etc/path_to_inst file.

# uptime
uptime - show how long the system has been up

/export/home/oraclown>uptime
11:32am up 4:19, 1 user, load average: 0.40, 1.17, 0.90

1.2.12 Well-known tools for AIX:
================================

1. commands:
------------

CPU        Memory Subsystem   I/O Subsystem   Network Subsystem
----------------------------------------------------------------
vmstat     vmstat             iostat          netstat
iostat     lsps               vmstat          ifconfig
ps         svmon              lsps            tcpdump
sar        filemon            filemon
tprof      ipcs               lvmstat

nmon and topas can be used to monitor those subsystems in general.

2. topas:
---------

topas is a useful graphical interface that will give you immediate results of what
is going on in the system.
When you run it without any command-line arguments, the screen looks like this:

Topas Monitor for host: aix4prt EVENTS/QUEUES FILE/TTY


Mon Apr 16 16:16:50 2001 Interval: 2 Cswitch 5984 Readch 4864
Syscall 15776 Writech 34280
Kernel 63.1 |################## | Reads 8 Rawin 0
User 36.8 |########## | Writes 2469 Ttyout 0
Wait 0.0 | | Forks 0 Igets 0
Idle 0.0 | | Execs 0 Namei 4
Runqueue 11.5 Dirblk 0
Network KBPS I-Pack O-Pack KB-In KB-Out Waitqueue 0.0
lo0 213.9 2154.2 2153.7 107.0 106.9
tr0 34.7 16.9 34.4 0.9 33.8 PAGING MEMORY
Faults 3862 Real,MB 1023
Disk Busy% KBPS TPS KB-Read KB-Writ Steals 1580 % Comp 27.0
hdisk0 0.0 0.0 0.0 0.0 0.0 PgspIn 0 % Noncomp 73.9
PgspOut 0 % Client 0.5
Name PID CPU% PgSp Owner PageIn 0
java 16684 83.6 35.1 root PageOut 0 PAGING SPACE
java 12192 12.7 86.2 root Sios 0 Size,MB 512
lrud 1032 2.7 0.0 root % Used 1.2
aixterm 19502 0.5 0.7 root NFS (calls/sec) % Free 98.7
topas 6908 0.5 0.8 root ServerV2 0
ksh 18148 0.0 0.7 root ClientV2 0 Press:
gil 1806 0.0 0.0 root ServerV3 0 "h" for help

The information on the bottom left side shows the most active processes; here,
java is consuming 83.6% of CPU.
The middle right area shows the total physical memory (1 GB in this case) and
Paging space (512 MB),
as well as the amount being used. So you get an excellent overview of what the
system is doing
in a single screen, and then you can select the areas to concentrate based on the
information being shown here.

Note about waits:
-----------------

Don't get caught up in this whole wait i/o thing. A single cpu system
with 1 i/o outstanding and no other runnable threads (i.e. idle) will
have 100% wait i/o. There was a big discussion a couple of years ago on
removing the kernel tick as it has confused many, many techs.

So, if you have only one or a few cpus, then you are going to have high wait i/o
figures; it does not necessarily mean your disk subsystem is slow.

3. trace:
---------

trace captures a sequential flow of time-stamped system events. The trace is a
valuable tool for observing system and application execution. While many of the
other tools provide high level statistics such as CPU and I/O utilization, the
trace facility helps expand the information as to where the events happened,
which process is responsible, when the events took place, and how they are
affecting the system.
Two post processing tools that can extract information from the trace are utld (in
AIX 4) and curt (in AIX 5). These provide statistics on CPU utilization and
process/thread activity. The third post processing tool is splat, which stands for
Simple Performance Lock Analysis Tool. This tool is used to analyze lock activity
in the AIX kernel and kernel extension for simple locks.
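
A minimal capture-and-report sequence could look like this (a sketch; the exact
trace options and hooks you need depend on what you are investigating):

# trace -a -o /tmp/trace.raw     (start tracing asynchronously, log to a file)
# sleep 10; trcstop              (collect for about 10 seconds, then stop)
# trcrpt /tmp/trace.raw > /tmp/trace.rpt
# curt -i /tmp/trace.raw -o /tmp/curt.out   (post-process with curt; see its syntax further down)
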
4. nmon:
--------

nmon is a free software tool that gives much of the same information as topas, but
saves the information
to a file in Lotus 123 and Excel format. The download site is
http://www.ibm.com/developerworks/eserver/articles/analyze_aix/.
The information that is collected included CPU, disk, network, adapter statistics,
kernel counters,
memory and the "top" process information.
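
For example, to record to a file instead of watching interactively (the standard
nmon recording flags):

# nmon -f -s 30 -c 120

This takes 120 snapshots at 30 second intervals and writes them to a .nmon file in
the current directory, which you can then load into the spreadsheet analysis
mentioned above.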

5. tprof:
---------

tprof is one of the AIX legacy tools that provides a detailed profile of CPU usage
for every
AIX process ID and name. It has been completely rewritten for AIX 5.2, and the
example below uses
the AIX 5.1 syntax. You should refer to AIX 5.2 Performance Tools update: Part 3
for the new syntax.

The simplest way to invoke this command is to use:

# tprof -kse -x "sleep 10"


# tprof -ske -x "sleep 30"

At the end of ten seconds, or 30 seconds, a new file __prof.all, or sleep.prof, is
generated that contains information about what commands are using CPU on the system.
Searching for FREQ, the information looks something like the example below:

Process FREQ Total Kernel User Shared Other


======= === ===== ====== ==== ====== =====
oracle 244 10635 3515 6897 223 0
java 247 3970 617 0 2062 1291
wait 16 1515 1515 0 0 0
...
======= === ===== ====== ==== ====== =====
Total 1060 19577 7947 7252 3087 1291

This example shows that over half the CPU time is associated with the oracle
application and that Java
is using about 3970/19577 or 1/5 of the CPU. The wait usually means idle time, but
can also include
the I/O wait portion of the CPU usage.

svmon:
------

The svmon command captures a snapshot of the current state of memory.
Use it with the -G switch to get global statistics for the whole system.
svmon is the most useful tool at your disposal when monitoring a Java process,
especially native heap.
The article "When segments collide" gives examples of how to use svmon -P <pid> -m
to monitor the
native heap of a Java process on AIX. But there is another variation, svmon -P
<pid> -m -r, that is very
effective in identifying native heap fragmentation. The -r switch prints the
address range in use, so it gives
a more accurate view of how much of each segment is in use.
As an example, look at the partially edited output below:

Pid Command Inuse Pin Pgsp Virtual 64-bit Mthrd LPage


10556 java 681613 2316 2461 501080 N Y N

Vsid Esid Type Description LPage Inuse Pin Pgsp Virtual


22ac4 9 mmap mapped to sid b1475 - 0 0 - -
21047 8 mmap mapped to sid 30fe5 - 0 0 - -
126a2 a mmap mapped to sid 91072 - 0 0 - -
7908c 7 mmap mapped to sid 6bced - 0 0 - -
b2ad6 b mmap mapped to sid b1035 - 0 0 - -
b1475 - work - 65536 0 282 65536
30fe5 - work - 65536 0 285 65536
91072 - work - 65536 0 54 65536
6bced - work - 65536 0 261 65536
b1035 - work - 45054 0 0 45054
Addr Range: 0..45055
e0f9f 5 work shmat/mmap - 48284 0 3 48284
19100 3 work shmat/mmap - 46997 0 463 47210
c965a 4 work shmat/mmap - 46835 0 281 46953
7910c 6 work shmat/mmap - 37070 0 0 37070
Addr Range: 0..50453
e801d d work shared library text - 9172 0 0 9220
Addr Range: 0..30861
a0fb7 f work shared library data - 105 0 1 106
Addr Range: 0..2521
21127 2 work process private - 50 2 1 51
Addr Range: 65300..65535
a8535 1 pers code,/dev/q109waslv:81938 - 11 0 - -
Addr Range: 0..11

Other example:

# svmon -G -i 2 5 # sample five times at two second intervals

memory in use pin pg space


size inuse free pin work pers clnt work pers clnt size inuse
16384 16250 134 2006 10675 2939 2636 2006 0 0 40960 12674
16384 16250 134 2006 10675 2939 2636 2006 0 0 40960 12674
16384 16250 134 2006 10675 2939 2636 2006 0 0 40960 12674
16384 16250 134 2006 10675 2939 2636 2006 0 0 40960 12674
16384 16250 134 2006 10675 2939 2636 2006 0 0 40960 12674

In this example, there are 16384 pages of memory in total. Multiply this number
by 4096 to see the total real memory size. In this case the total memory is 64 MB.

filemon:
--------

filemon can be used to identify the files that are being used most actively. This
tool gives a very
comprehensive view of file access, and can be useful for drilling down once
vmstat/iostat confirm disk
to be a bottleneck.

Example:

# filemon -o /tmp/filemon.log; sleep 60; trcstop

The generated log file is quite large. Some sections that may be useful are:

Most Active Files


------------------------------------------------------------------------
#MBs #opns #rds #wrs file volume:inode
------------------------------------------------------------------------

25.7 83 6589 0 unix /dev/hd2:147514


16.3 1 4175 0 vxe102 /dev/mailv1:581
16.3 1 0 4173 .vxe102.pop /dev/poboxv:62
15.8 1 1 4044 tst1 /dev/mailt1:904
8.3 2117 2327 0 passwd /dev/hd4:8205
3.2 182 810 1 services /dev/hd4:8652
...
------------------------------------------------------------------------
Detailed File Stats
------------------------------------------------------------------------

FILE: /var/spool/mail/v/vxe102 volume: /dev/mailv1 (/var/spool2/mail/v)


inode: 581
opens: 1
total bytes xfrd: 17100800
reads: 4175 (0 errs)
read sizes (bytes): avg 4096.0 min 4096 max 4096 sdev 0.0
read times (msec): avg 0.543 min 0.011 max 78.060 sdev 2.753
...

curt:
-----

curt Command
Purpose
The CPU Utilization Reporting Tool (curt) command converts an AIX trace file into
a number of statistics related
to CPU utilization and either process, thread or pthread activity. These
statistics ease the tracking of
specific application activity. curt works with both uniprocessor and
multiprocessor AIX Version 4 and AIX Version 5
traces.

Syntax
curt -i inputfile [-o outputfile] [-n gennamesfile] [-m trcnmfile] [-a
pidnamefile] [-f timestamp]
[-l timestamp] [-ehpstP]

Description
The curt command takes an AIX trace file as input and produces a number of
statistics related to
processor (CPU) utilization and process/thread/pthread activity. It will work with
both uniprocessor and
multiprocessor AIX traces if the processor clocks are properly synchronized.

1.2.13 Not so well known tools for AIX: the proc tools:
=======================================================

-- proctree
Displays the process tree containing the specified process IDs or users. To
display the ancestors and all the children of process 21166, enter:

# proctree 21166
11238 /usr/sbin/srcmstr
21166 /usr/sbin/rsct/bin/IBM.AuditRMd

To display the ancestors and children of process 21166, including children of
process 0, enter:

# proctree -a 21166
1 /etc/init
11238 /usr/sbin/srcmstr
21166 /usr/sbin/rsct/bin/IBM.AuditRMd

-- procstack
Displays the hexadecimal addresses and symbolic names for each of the stack frames
of the current thread
in processes. To display the current stack of process 15052, enter:

# procstack 15052
15052 : /usr/sbin/snmpd
d025ab80 select (?, ?, ?, ?, ?) + 90
100015f4 main (?, ?, ?) + 1814
10000128 __start () + 8c

Currently, procstack displays garbage or wrong information for the top stack
frame, and possibly for the
second top stack frame. Sometimes it will erroneously display "No frames found on
the stack," and sometimes
it will display: deadbeef ???????? (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ...) The
fix for this problem had not
been released at the writing of this article. When the fix becomes available, you
need to download the
APAR IY48543 for 5.2. For AIX 5.3 it all should work OK.

-- procmap
Displays a process address map. To display the address space of process 13204,
enter:

# procmap 13204
13204 : /usr/sbin/biod 6
10000000 3K read/exec biod
20000910 0K read/write biod
d0083100 79K read/exec /usr/lib/libiconv.a
20013bf0 41K read/write /usr/lib/libiconv.a
d007a100 34K read/exec /usr/lib/libi18n.a
20011378 4K read/write /usr/lib/libi18n.a
d0074000 11K read/exec /usr/lib/nls/loc/en_US
d0077130 8K read/write /usr/lib/nls/loc/en_US
d00730f8 2K read/exec /usr/lib/libcrypt.a
f03c7508 0K read/write /usr/lib/libcrypt.a
d01d4e20 1997K read/exec /usr/lib/libc.a
f0337e90 570K read/write /usr/lib/libc.a

-- procldd
Displays a list of libraries loaded by a process. To display the list of dynamic
libraries loaded by
process 11928, enter

# procldd 11928
11928 : -sh
/usr/lib/nls/loc/en_US
/usr/lib/libcrypt.a
/usr/lib/libc.a

-- procflags
Displays a process tracing flags, and the pending and holding signals. To display
the tracing flags of
process 28138, enter:

# procflags 28138
28138 : /usr/sbin/rsct/bin/IBM.HostRMd
data model = _ILP32 flags = PR_FORK
/64763: flags = PR_ASLEEP | PR_NOREGS
/66315: flags = PR_ASLEEP | PR_NOREGS
/60641: flags = PR_ASLEEP | PR_NOREGS
/66827: flags = PR_ASLEEP | PR_NOREGS
/7515: flags = PR_ASLEEP | PR_NOREGS
/70439: flags = PR_ASLEEP | PR_NOREGS
/66061: flags = PR_ASLEEP | PR_NOREGS
/69149: flags = PR_ASLEEP | PR_NOREGS

-- procsig
Lists the signal actions for a process. To list all the signal actions defined for
process 30552, enter:

# procsig 30552
30552 : -ksh
HUP caught
INT caught
QUIT caught
ILL caught
TRAP caught
ABRT caught
EMT caught
FPE caught
KILL default RESTART BUS caught

-- proccred
Prints a process' credentials. To display the credentials of process 25632, enter:

# proccred 25632
25632: e/r/suid=0 e/r/sgid=0

-- procfiles
Prints a list of open file descriptors. To display status and control information
on the file descriptors
opened by process 20138, enter:

# procfiles -n 20138
20138 : /usr/sbin/rsct/bin/IBM.CSMAgentRMd
Current rlimit: 2147483647 file descriptors
0: S_IFCHR mode:00 dev:10,4 ino:4178 uid:0 gid:0 rdev:2,2
O_RDWR name:/dev/null
2: S_IFREG mode:0311 dev:10,6 ino:250 uid:0 gid:0 rdev:0,0
O_RDWR size:0 name:/var/ct/IBM.CSMAgentRM.stderr
4: S_IFREG mode:0200 dev:10,6 ino:255 uid:0 gid:0 rdev:0,0

-- procwdx
Prints the current working directory for a process. To display the current working
directory
of process 11928, enter:

# procwdx 11928
11928 : /home/guest

-- procstop
Stops a process. To stop process 7500 on the PR_REQUESTED event, enter:

# procstop 7500

-- procrun
Restart a process. To restart process 30192 that was stopped on the PR_REQUESTED
event, enter:

# procrun 30192

-- procwait
Waits for all of the specified processes to terminate. To wait for process 12942
to exit and display
the status, enter:

# procwait -v 12942
12942 : terminated, exit status 0

1.2.14 Other monitoring:
========================

Nagios: open source Monitoring for most unix systems:
-----------------------------------------------------

Nagios is an open source host, service and network monitoring program.

Latest versions: 2.5 (stable)

Overview

Nagios is a host and service monitor designed to inform you of network problems
before your clients,
end-users or managers do. It has been designed to run under the Linux operating
system, but works fine
under most *NIX variants as well. The monitoring daemon runs intermittent checks
on hosts and services you specify
using external "plugins" which return status information to Nagios. When problems
are encountered,
the daemon can send notifications out to administrative contacts in a variety of
different ways
(email, instant message, SMS, etc.). Current status information, historical logs,
and reports can all
be accessed via a web browser.

System Requirements

The only requirement of running Nagios is a machine running Linux (or UNIX
variant) and a C compiler.
You will probably also want to have TCP/IP configured, as most service checks will
be performed over the network.

You are not required to use the CGIs included with Nagios. However, if you do
decide to use them,
you will need to have the following software installed...

- A web server (preferably Apache)

- Thomas Boutell's gd library version 1.6.3 or higher (required by the statusmap
and trends CGIs)

rstat: Monitoring Machine Utilization with rstat:
-------------------------------------------------

rstat stands for Remote System Statistics service

Ports exist for most unixes, like Linux, Solaris, AIX etc..

-- rstat on Linux, Solaris:

rstat is an RPC client program to get and print statistics from any machine
running the rpc.rstatd daemon,
its server-side counterpart. The rpc.rstatd daemon has been used for many years by
tools such as Sun's perfmeter
and the rup command. The rstat program is simply a new client for an old daemon.
The fact that the rpc.rstatd daemon
is already installed and running on most Solaris and Linux machines is a huge
advantage over other tools
that require the installation of custom agents.

The rstat client compiles and runs on Solaris and Linux as well and can get
statistics from any machine running
a current rpc.rstatd daemon, such as Solaris, Linux, AIX, and OpenBSD. The
rpc.rstatd daemon is started
from /etc/inetd.conf on Solaris. It is similar to vmstat, but has some advantages
over vmstat:

You can get statistics without logging in to the remote machine, including over
the Internet.

It includes a timestamp.

The output can be plotted directly by gnuplot.

The fact that it runs remotely means that you can use a single central machine to
monitor the performance
of many remote machines. It also has a disadvantage in that it does not give the
useful scan rate measurement
of memory shortage, the sr column in vmstat. rstat will not work across most
firewalls because it relies on
port 111, the RPC port, which is usually blocked by firewalls.

To use rstat, simply give it the name or IP address of the machine you wish to
monitor. Remember that rpc.rstatd
must be running on that machine. The rup command is extremely useful here because
with no arguments,
it simply prints out a list of all machines on the local network that are running
the rstatd demon.
If a machine is not listed, you may have to start rstatd manually.

To start rpc.rstatd under Red Hat Linux, run the following command as root:

# /etc/rc.d/init.d/rstatd start

On Solaris, first try running the rstat client because inetd is often already
configured to automatically
start rpc.rstatd on request. If the client fails with the error "RPC: Program
not registered,"
make sure you have this line in your /etc/inet/inetd.conf and kill -HUP your inetd
process to get it to
re-read inetd.conf, as follows:

rstatd/2-4 tli rpc/datagram_v wait root /usr/lib/netsvc/rstat/rpc.rstatd rpc.rstatd

Then you can monitor that machine like this:

% rstat enkidu
2001 07 10 10 36 08 0 0 0 100 0 27 54 1 0 0 12 0.1

This command will give you a one-second average and then it will exit. If you want
to continuously monitor,
give an interval in seconds on the command line. Here's an example of one line of
output every two seconds:

% rstat enkidu 2
2001 07 10 10 36 28 0 0 1 98 0 0 7 2 0 0 61 0.0
2001 07 10 10 36 30 0 0 0 100 0 0 0 2 0 0 15 0.0
2001 07 10 10 36 32 0 0 0 100 0 0 0 2 0 0 15 0.0
2001 07 10 10 36 34 0 0 0 100 0 5 10 2 0 0 19 0.0
2001 07 10 10 36 36 0 0 0 100 0 0 46 2 0 0 108 0.0
^C

To get a usage message, the output format, the version number, and where to go for
updates, just type rstat
with no parameters:

% rstat
usage: rstat machine [interval]
output:
yyyy mm dd hh mm ss usr wio sys idl pgin pgout intr ipkts opkts coll cs load
docs and src at http://patrick.net/software/rstat/rstat.html

Notice that the column headings line up with the output data.

-- AIX:

In order to get rstat working on AIX, you may need to configure rstatd.

As root

1. Edit /etc/inetd.conf
Uncomment or add entry for rstatd
Eg
rstatd sunrpc_udp udp wait root /usr/sbin/rpc.rstatd rstatd 100001 1-3

2. Edit /etc/services
Uncomment or add entry for rstatd
Eg
rstatd 100001/udp

3. Refresh services
refresh -s inetd

4. Start rstatd
/usr/sbin/rpc.rstatd
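
To verify that the daemon is registered with the portmapper, a quick check (output
varies per system) could be:

# rpcinfo -p localhost | grep rstatd
# rup localhost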

==================================
2. NFS and Mount command examples:
==================================

2.1 NFS:
========

We will discuss the most important features of NFS, by showing how it is implemented
on Solaris, Redhat and SuSE Linux. Most of this applies to HP-UX and AIX as well.

2.1.1 NFS and Redhat Linux:
---------------------------

Linux uses a combination of kernel-level support and continuously running daemon
processes to provide NFS file sharing; however, NFS support must be enabled in the
Linux kernel to function.
NFS uses Remote Procedure Calls (RPC) to route requests between clients and
servers, meaning that the
portmap service must be enabled and active at the proper runlevels for NFS
communication to occur.
Working with portmap, various other processes ensure that a particular NFS
connection is allowed and may
proceed without error:

rpc.mountd  - The running process that receives the mount request from an NFS
              client and checks to see if it matches with a currently exported
              file system.
rpc.nfsd    - The process that implements the user-level part of the NFS service.
              It works with the Linux kernel to meet the dynamic demands of NFS
              clients, such as providing additional server threads for NFS clients
              to use.
rpc.lockd   - A daemon that is not necessary with modern kernels. NFS file locking
              is now done by the kernel. It is included with the nfs-utils package
              for users of older kernels that do not include this functionality
              by default.
rpc.statd   - Implements the Network Status Monitor (NSM) RPC protocol. This
              provides reboot notification when an NFS server is restarted without
              being gracefully brought down.
rpc.rquotad - An RPC server that provides user quota information for remote users.

Not all of these programs are required for NFS service. The only services that
must be enabled are rpc.mountd,
rpc.nfsd, and portmap. The other daemons provide additional functionality and
should only be used if your server
environment requires them.

NFS version 2 uses the User Datagram Protocol (UDP) to provide a stateless network
connection between
the client and server. NFS version 3 can use UDP or TCP running over an IP. The
stateless UDP connection
minimizes network traffic, as the NFS server sends the client a cookie after the
client is authorized
to access the shared volume. This cookie is a random value stored on the server's
side and is passed along with RPC requests from the client. The NFS server can be restarted
without affecting the clients
and the cookie will remain intact.

NFS only performs authentication when a client system attempts to mount a remote
file system. To limit access,
the NFS server first employs TCP wrappers. TCP wrappers reads the /etc/hosts.allow
and /etc/hosts.deny files
to determine if a particular client should be permitted or prevented access to the
NFS server.
After the client is allowed past TCP wrappers, the NFS server refers to its
configuration file,
"/etc/exports", to determine whether the client has enough privileges to mount any
of the exported file systems.
After granting access, any file and directory operations are sent to the server
using remote procedure calls.

Warning
NFS mount privileges are granted specifically to a client, not a user. If you
grant a client machine access
to an exported file system, any users of that machine will have access to the
data.

When configuring the /etc/exports file, be extremely careful about granting
read-write permissions (rw) to a remote host.

-- NFS and portmap


NFS relies upon remote procedure calls (RPC) to function. portmap is required to
map RPC requests to the
correct services. RPC processes notify portmap when they start, revealing the port
number they are monitoring
and the RPC program numbers they expect to serve. The client system then contacts
portmap on the server with
a particular RPC program number. portmap then redirects the client to the proper
port number to communicate
with its intended service.

Because RPC-based services rely on portmap to make all connections with incoming
client requests,
portmap must be available before any of these services start. If, for some reason,
the portmap service
unexpectedly quits, restart portmap and any services running when it was started.

The portmap service can be used with the host access files (/etc/hosts.allow and
/etc/hosts.deny) to control
which remote systems are permitted to use RPC-based services on your machine.
Access control rules for portmap
will affect all RPC-based services. Alternatively, you can specify each of the NFS
RPC daemons to be affected
by a particular access control rule. The man pages for rpc.mountd and rpc.statd
contain information regarding
the precise syntax of these rules.
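
As a sketch, assuming you only want hosts in the 192.168.0.x range to reach the
RPC services, the host access files could contain something like:

/etc/hosts.deny:
portmap: ALL

/etc/hosts.allow:
portmap: 192.168.0.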

-- portmap Status
As portmap provides the coordination between RPC services and the port numbers
used to communicate with them,
it is useful to be able to get a picture of the current RPC services using portmap
when troubleshooting.
The rpcinfo command shows each RPC-based service with its port number, RPC program
number, version,
and IP protocol type (TCP or UDP).
To make sure the proper NFS RPC-based services are enabled for portmap, rpcinfo -p
can be useful:

# rpcinfo -p

program vers proto port
100000 2 tcp 111 portmapper
100000 2 udp 111 portmapper
100024 1 udp 1024 status
100024 1 tcp 1024 status
100011 1 udp 819 rquotad
100011 2 udp 819 rquotad
100005 1 udp 1027 mountd
100005 1 tcp 1106 mountd
100005 2 udp 1027 mountd
100005 2 tcp 1106 mountd
100005 3 udp 1027 mountd
100005 3 tcp 1106 mountd
100003 2 udp 2049 nfs
100003 3 udp 2049 nfs
100021 1 udp 1028 nlockmgr
100021 3 udp 1028 nlockmgr
100021 4 udp 1028 nlockmgr

The -p option probes the portmapper on the specified host or defaults to localhost
if no specific host is listed.
Other options are available from the rpcinfo man page.
From the output above, various NFS services can be seen running. If one of the NFS
services does not start up
correctly, portmap will be unable to map RPC requests from clients for that
service to the correct port.
In many cases, restarting NFS as root (/sbin/service nfs restart) will cause those
services to correctly register with portmap and begin working.

# /sbin/service nfs restart

-- NFS Server Configuration Files


Configuring a system to share files and directories using NFS is straightforward.
Every file system being
exported to remote users via NFS, as well as the access rights relating to those
file systems,
is located in the /etc/exports file. This file is read by the exportfs command to
give rpc.mountd and rpc.nfsd
the information necessary to allow the remote mounting of a file system by an
authorized host.

The exportfs command allows you to selectively export or unexport directories
without restarting the various NFS services. When exportfs is passed the proper
options, the file systems to be exported are written to /var/lib/nfs/xtab.
Since rpc.mountd refers to the xtab file when deciding access privileges to a
file system, changes to the list of exported file systems take effect immediately.
Various options are available when using exportfs:

-r - Causes all directories listed in /etc/exports to be exported by constructing
     a new export list in /var/lib/nfs/xtab. This option effectively refreshes the
     export list with any changes that have been made to /etc/exports.

-a - Causes all directories to be exported or unexported, depending on the other
     options passed to exportfs.

-o options - Allows the user to specify directories to be exported that are not
     listed in /etc/exports. These additional file system shares must be written
     in the same way they are specified in /etc/exports. This option is used to
     test an exported file system before adding it permanently to the list of
     file systems to be exported.

-i - Tells exportfs to ignore /etc/exports; only options given from the command
     line are used to define exported file systems.

-u - Unexports directories from being mounted by remote users. The command
     exportfs -ua effectively suspends NFS file sharing while keeping the various
     NFS daemons up. To allow NFS sharing to continue, type exportfs -r.

-v - Verbose operation, where the file systems being exported or unexported are
     displayed in greater detail when the exportfs command is executed.

If no options are passed to the exportfs command, it displays a list of currently
exported file systems.

Changes to /etc/exports can also be read by reloading the NFS service with the
service nfs reload command.
This keeps the NFS daemons running while re-exporting the /etc/exports file.
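
A few exportfs examples, using the options described above (the directory and host
are hypothetical):

# exportfs -v                           show what is currently exported
# exportfs -r                           re-export everything listed in /etc/exports
# exportfs -o ro 192.168.0.3:/testdir   temporarily export /testdir read-only to that host
# exportfs -ua                          unexport all directories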

-- /etc/exports
The /etc/exports file is the standard for controlling which file systems are
exported to which hosts,
as well as specifying particular options that control everything. Blank lines are
ignored, comments can be made
using #, and long lines can be wrapped with a backslash (\). Each exported file
system should be on its own line.
Lists of authorized hosts placed after an exported file system must be separated
by space characters.
Options for each of the hosts must be placed in parentheses directly after the
host identifier, without any spaces
separating the host and the first parenthesis.

In its simplest form, /etc/exports only needs to know the directory to be exported
and the hosts
permitted to use it:
/some/directory bob.domain.com
/another/exported/directory 192.168.0.3


After re-exporting /etc/exports with the "/sbin/service nfs reload" command, the
bob.domain.com host will be
able to mount /some/directory and 192.168.0.3 can mount
/another/exported/directory. Because no options
are specified in this example, several default NFS preferences take effect.

In order to override these defaults, you must specify an option that takes its
place. For example, if you do
not specify rw, then that export will only be shared read-only. Each default for
every exported file system
must be explicitly overridden. Additionally, other options are available where no
default value is in place.
These include the ability to disable sub-tree checking, allow access from insecure
ports, and allow insecure
file locks (necessary for certain early NFS client implementations). See the
exports man page for details
on these lesser used options.
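
For illustration, an /etc/exports line that explicitly overrides a few of these
defaults might look like this (the path and host are examples; the option names
come from the exports man page):

/data/projects   bob.domain.com(rw,sync,no_subtree_check,insecure)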

When specifying hostnames, you can use the following methods:

single host - Where one particular host is specified with a fully qualified domain
              name, hostname, or IP address.

wildcards - Where a * or ? character is used to take into account a grouping of
            fully qualified domain names that match a particular string of letters.
            Wildcards are not to be used with IP addresses; however, they may
            accidentally work if reverse DNS lookups fail.

However, be careful when using wildcards with fully qualified domain names, as
they tend to be more exact
than you would expect. For example, the use of *.domain.com as wildcard will allow
sales.domain.com to access
the exported file system, but not bob.sales.domain.com. To match both
possibilities, as well as
sam.corp.domain.com, you would have to provide *.domain.com *.*.domain.com.

IP networks - Allows the matching of hosts based on their IP addresses within a
              larger network. For example, 192.168.0.0/28 will allow the first 16
              IP addresses, from 192.168.0.0 to 192.168.0.15, to access the
              exported file system, but not 192.168.0.16 and higher.

netgroups - Permits an NIS netgroup name, written as @<group-name>, to be used.
            This effectively puts the NIS server in charge of access control for
            this exported file system, where users can be added and removed from
            an NIS group without affecting /etc/exports.
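
Putting these notations together, a single (hypothetical) export line could read:

/shared   bob.domain.com(rw) *.sales.domain.com(ro) 192.168.0.0/28(rw) @mygroup(ro)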

Warning
The way in which the /etc/exports file is formatted is very important,
particularly concerning the use of
space characters. Remember to always separate exported file systems from hosts
and hosts from one another
with a space character. However, there should be no other space characters in
the file unless they are used
in comment lines.

For example, the following two lines do not mean the same thing:

/home bob.domain.com(rw)
/home bob.domain.com (rw)

The first line allows only users from bob.domain.com read-write access to the
/home directory. The second line allows users from bob.domain.com to mount the
directory read-only (the default), but the rest of the world can mount it
read-write. Be careful where space characters are used in /etc/exports.

-- NFS Client Configuration Files - What to do on a client?

Any NFS share made available by a server can be mounted using various methods. Of
course, the share can be
manually mounted, using the mount command, to acquire the exported file system at
a particular mount point.
However, this requires that the root user type the mount command every time the
system restarts.
In addition, the root user must remember to unmount the file system when shutting
down the machine.
Two methods of configuring NFS mounts include modifying the /etc/fstab or using
the autofs service.

> /etc/fstab
Placing a properly formatted line in the /etc/fstab file has the same effect as
manually mounting the
exported file system. The /etc/fstab file is read by the /etc/rc.d/init.d/netfs
script at system startup.
The proper file system mounts, including NFS, are put into place.

A sample /etc/fstab line to mount an NFS export looks like the following:

<server>:</path/of/dir> </local/mnt/point> nfs <options> 0 0

The <server> field relates to the hostname, IP address, or fully qualified domain
name of the server exporting the file system. The </path/of/dir> field tells the
server what export to mount. The </local/mnt/point> field specifies where on the
local file system to mount the exported directory. This mount point must exist
before /etc/fstab is read or the mount will fail. The nfs option specifies the
type of file system being mounted.

The <options> area specifies how the file system is to be mounted. For example, if
the options
area states rw,suid on a particular mount, the exported file system will be
mounted read-write and the
user and group ID set by the server will be used. Note, parentheses are not to be
used here.
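
A concrete /etc/fstab line, with a hypothetical server name and using mount options
that appear elsewhere in this document, could thus be:

nfssrv01:/export/data   /data   nfs   rw,hard,intr   0 0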

2.1.2 NFS and SuSE Linux:
-------------------------

-- Importing File Systems with YaST

Any user authorized to do so can mount NFS directories from an NFS server into his
own file tree. This can be achieved most easily using the YaST module "NFS Client".
Just enter the host name of the NFS server, the directory to import, and the mount
point at which to mount this directory locally. All this is done after clicking
"Add" in the first dialog.

-- Importing File Systems Manually

File systems can easily be imported manually from an NFS server. The only
prerequisite is a running
RPC port mapper, which can be started by entering the command
# rcportmap start

as root. Once this prerequisite is met, remote file systems exported on the
respective machines
can be mounted in the file system just like local hard disks using the command
mount with the following syntax:

# mount host:remote-path local-path

If user directories from the machine sun, for example, should be imported, the
following command can be used:

# mount sun:/home /home

-- Exporting File Systems with YaST

With YaST, turn a host in your network into an NFS server - a server that exports
directories and files to all hosts granted access to it. This could be done to
provide applications to all coworkers of a group without installing them locally
on each and every host. To install such a server, start YaST and select
"Network Services" -> "NFS Server".

Next, activate "Start NFS Server" and click "Next". In the upper text field, enter
the directories to export. Below, enter the hosts that should have access to them.
There are four options that can be set for each host: single host, netgroups,
wildcards, and IP networks. A more thorough explanation of these options is
provided by man exports. "Exit" completes the configuration.

-- Exporting File Systems Manually

If you do not want to use YaST, make sure the following systems run on the NFS
server:

RPC portmapper (portmap)
RPC mount daemon (rpc.mountd)
RPC NFS daemon (rpc.nfsd)

For these services to be started by the scripts "/etc/init.d/portmap" and
"/etc/init.d/nfsserver" when the system is booted, enter the commands:

# insserv /etc/init.d/nfsserver
# insserv /etc/init.d/portmap

Also define which file systems should be exported to which host in the
configuration file "/etc/exports".

For each directory to export, one line is needed to set which machines may access
that directory
with what permissions. All subdirectories of this directory are automatically
exported as well.
Authorized machines are usually specified with their full names (including domain
name), but it is possible
to use wild cards like * or ? (which expand the same way as in the Bash shell). If
no machine is specified here,
any machine is allowed to import this file system with the given permissions.

Set permissions for the file system to export in brackets after the machine name.
The most important options are:

ro              File system is exported with read-only permission (default).
rw              File system is exported with read-write permission.
root_squash     This makes sure the user root of the given machine does not have
                root permissions on this file system. This is achieved by assigning
                user ID 65534 to users with user ID 0 (root). This user ID should
                be set to nobody (which is the default).
no_root_squash  Does not assign user ID 0 to user ID 65534, keeping the root
                permissions valid.
link_relative   Converts absolute links (those beginning with /) to a sequence of
                ../. This is only useful if the entire file system of a machine is
                mounted (default).
link_absolute   Symbolic links remain untouched.
map_identity    User IDs are exactly the same on both client and server (default).
map_daemon      Client and server do not have matching user IDs. This tells nfsd
                to create a conversion table for user IDs. The ugidd daemon is
                required for this to work.

/etc/exports is read by mountd and nfsd. If you change anything in this file,
restart mountd and nfsd for your changes to take effect. This can easily be done
with "rcnfsserver restart".
Example SuSE /etc/exports

#
# /etc/exports
#
/home sun(rw) venus(rw)
/usr/X11 sun(ro) venus(ro)
/usr/lib/texmf sun(ro) venus(rw)
/ earth(ro,root_squash)
/home/ftp (ro)
# End of exports

2.2 Mount command:
==================

The standard form of the mount command is:

mount -F typefs device mountdir (Solaris, HP-UX)
mount -t typefs device mountdir (many other unixes)

This tells the kernel to attach the file system found on "device" (which is of
type "typefs") at the directory "mountdir". The previous contents (if any) and
owner and mode of mountdir become invisible, and as long as this file system
remains mounted, the pathname mountdir refers to the root of the file system
on device.

The syntax is:

mount [options] [type] [device] [mountpoint]

-- mounting a remote filesystem:

syntax: mount -F nfs <options> <-o specific options> -O <server>:<filesystem> <local_mount_point>

# mount -F nfs hpsrv:/data /data
# mount -F nfs -o hard,intr thor:/data /data

- standard mounts are determined by files like /etc/fstab (HP-UX) or
  /etc/filesystems (AIX) or /etc/vfstab etc.

2.2.1 Where are the standard mounts defined?
============================================

In Solaris:
===========

- standard mounts are determined by /etc/vfstab
- NFS shares are determined by the file /etc/dfs/dfstab. Here you will find the
  share commands.
- currently mounted filesystems are listed in /etc/mnttab

In Linux:
=========

- standard mounts are determined by most Linux distros by "/etc/fstab".

In AIX:
=======

- standard mounts and properties are determined by the file "/etc/filesystems".

In HP-UX:
=========

There is a /etc/fstab which contains all of the filesystems that are mounted at
boot time. The filesystems that are OS related are /, /var, /opt, /tmp, /usr and
/stand.

The filesystem that is special is /stand; this is where your kernel is built and
resides. Notice that its filesystem type is "hfs". HP-UX kernels MUST reside on an
hfs filesystem.

An example of /etc/vfstab:
--------------------------

starboss:/etc $ more vfstab
#device            device             mount       FS     fsck   mount    mount
#to mount          to fsck            point       type   pass   at boot  options
#
fd                 -                  /dev/fd     fd     -      no       -
/proc              -                  /proc       proc   -      no       -
/dev/md/dsk/d1     -                  -           swap   -      no       -
/dev/md/dsk/d0     /dev/md/rdsk/d0    /           ufs    1      no       logging
/dev/md/dsk/d4     /dev/md/rdsk/d4    /usr        ufs    1      no       logging
/dev/md/dsk/d3     /dev/md/rdsk/d3    /var        ufs    1      no       logging
/dev/md/dsk/d7     /dev/md/rdsk/d7    /export     ufs    2      yes      logging
/dev/md/dsk/d5     /dev/md/rdsk/d5    /usr/local  ufs    2      yes      logging
/dev/dsk/c2t0d0s0  /dev/rdsk/c2t0d0s0 /export2    ufs    2      yes      logging
swap               -                  /tmp        tmpfs  -      yes      size=512m

mount adds an entry, umount deletes an entry.
Mounting applies to local filesystems, or remote filesystems via NFS.

At the remote server:
share, shareall, or add an entry in /etc/dfs/dfstab
# share -F nfs /var/mail

Unmount a mounted FS. First check who is using it:

# fuser -c mountpoint
# umount mountpoint

2.2.2 Mounting a NFS filesystem in HP-UX:
=========================================

Mounting Remote File Systems


You can use either SAM or the mount command to mount file systems located on a
remote system.

Before you can mount file systems located on a remote system, NFS software must be
installed and
configured on both local and remote systems. Refer to Installing and Administering
NFS for information.

For information on mounting NFS file systems using SAM, see SAM's online help.

To mount a remote file system using HP-UX commands,

You must know the name of the host machine and the file system's directory on the
remote machine.
Establish communication over a network between the local system (that is, the
"client") and the
remote system. (The local system must be able to reach the remote system via
whatever hosts database is in use.)
(See named(1M) and hosts(4).) If necessary, test the connection with
/usr/sbin/ping; see ping(1M).

Make sure the file /etc/exports on the remote system lists the file systems that
you wish to make available
to clients (that is, to "export") and the local systems that you wish to mount the
file systems.

For example, to allow machines called rolf and egbert to remotely mount the /usr
file system, edit the file
/etc/exports on the remote machine and include the line:

/usr rolf egbert

Execute /usr/sbin/exportfs -a on the remote system to export all directories in
/etc/exports to clients.

For more information, see exportfs(1M).

NOTE: If you wish to invoke exportfs -a at boot time, make sure the NFS
configuration file /etc/rc.config.d/nfsconf
on the remote system contains the following settings: NFS_SERVER=1 and
START_MOUNTD=1.
The client's /etc/rc.config.d/nfsconf file must contain NFS_CLIENT=1. Then issue
the following command
to run the script:
/sbin/init.d/nfs.server start

Mount the file system on the local system, as in:

# mount -F nfs remotehost:/remote_dir /local_dir


Just a bunch of mount command examples:
---------------------------------------

# mount
# mount -a
# mountall -l
# mount -t type device dir
# mount -F pcfs /dev/dsk/c0t0d0p0:c /pcfs/c
# mount /dev/md/dsk/d7 /u01
# mount sun:/home /home
# mount -t nfs 137.82.51.1:/share/sunos/local /usr/local
# mount /dev/fd0 /mnt/floppy
# mount -o ro /dev/dsk/c0t6d0s1 /mnt/cdrom
# mount -V cdrfs -o ro /dev/cd0 /cdrom

2.2.3 Solaris mount command:
============================

The unix mount command is used to mount a filesystem; it attaches disks and
directories logically rather than physically. It takes a minimum of two arguments:

1) the name of the special device which contains the filesystem
2) the name of an existing directory on which to mount the file system

Once the file system is mounted, the directory becomes the mount point. All the
file systems will now be usable
as if they were subdirectories of the file system they were mounted on. The table
of currently mounted file systems
can be found by examining the mounted file system information file. This is
provided by a file system that is usually
mounted on /etc/mnttab.

Mounting a file system causes three actions to occur:

1. The superblock for the mounted file system is read into memory
2. An entry is made in the /etc/mnttab file
3. An entry is made in the inode for the directory on which the file system is
mounted which marks the directory
as a mount point

The /etc/mountall command mounts all filesystems as described in the /etc/vfstab
file. Note that the /etc/mount and /etc/mountall commands can only be executed by
the superuser.

OPTIONS

-F FSType
Used to specify the FSType on which to operate. The FSType must be specified or
must be determinable from
/etc/vfstab, or by consulting /etc/default/fs or /etc/dfs/fstypes.

-a [ mount_points. . . ]
Perform mount or umount operations in parallel, when possible.
If mount points are not specified, mount will mount all file systems whose
/etc/vfstab "mount at boot"
field is "yes". If mount points are specified, then /etc/vfstab "mount at boot"
field will be ignored.

If mount points are specified, umount will only umount those mount points. If none
is specified, then umount
will attempt to unmount all file systems in /etc/mnttab, with the exception of
certain system
required file systems: /, /usr, /var, /var/adm, /var/run, /proc, /dev/fd and /tmp.

-f Forcibly unmount a file system.
   Without this option, umount does not allow a file system to be unmounted if a
   file on the file system is busy. Using this option can cause data loss for open
   files; programs which access files after the file system has been unmounted
   will get an error (EIO).
-p Print the list of mounted file systems in the /etc/vfstab format. Must be the
only option specified.

-v Print the list of mounted file systems in verbose format. Must be the only
option specified.

-V Echo the complete command line, but do not execute the command. umount
generates a command line by using the
options and arguments provided by the user and adding to them information
derived from /etc/mnttab. This
option should be used to verify and validate the command line.

generic_options
Options that are commonly supported by most FSType-specific command modules. The
following options are
available:

-m Mount the file system without making an entry in /etc/mnttab.

-g Globally mount the file system. On a clustered system, this globally mounts the
file system on
all nodes of the cluster. On a non-clustered system this has no effect.

-o Specify FSType-specific options in a comma separated (without spaces) list of
   suboptions and keyword-attribute pairs for interpretation by the FSType-specific
   module of the command. (See mount_ufs(1M))

-O Overlay mount. Allow the file system to be mounted over an existing mount
   point, making the underlying file system inaccessible. If a mount is attempted
   on a pre-existing mount point without setting this flag, the mount will fail,
   producing the error "device busy".

-r Mount the file system read-only.


Example mountpoints and disks:
------------------------------

Mountpoint  Device            Size   Purpose

/           /dev/md/dsk/d1    100    Unix root filesystem
/usr        /dev/md/dsk/d3    1200   Unix usr filesystem
/var        /dev/md/dsk/d4    200    Unix var filesystem
/home       /dev/md/dsk/d5    200    Unix home filesystem
/opt        /dev/md/dsk/d6    4700   Oracle_Home
/u01        /dev/md/dsk/d7    8700   Oracle datafiles
/u02        /dev/md/dsk/d8    8700   Oracle datafiles
/u03        /dev/md/dsk/d9    8700   Oracle datafiles
/u04        /dev/md/dsk/d10   8700   Oracle datafiles
/u05        /dev/md/dsk/d110  8700   Oracle datafiles
/u06        /dev/md/dsk/d120  8700   Oracle datafiles
/u07        /dev/md/dsk/d123  8650   Oracle datafiles

Suppose you have only 1 disk of about 72GB, 2GB RAM:

Entire disk= Slice 2

/        Slice 0, partition about 2G
swap     Slice 1, partition about 4G
/export  Slice 3, partition about 50G, maybe you link it to /u01
/var     Slice 4, partition about 2G
/opt     Slice 5, partition about 10G if you plan to install apps here
/usr     Slice 6, partition about 2G
/u01     Slice 7, partition optional; standard it's /home.
         Depending on how you configure /export, size could be around 20G.


2.2.4 mount command on AIX:
===========================

Typical examples:

# mount -o soft 10.32.66.75:/data/nim /mnt
# mount -o soft abcsrv:/data/nim /mnt
# mount -o soft n580l03:/data/nim /mnt

Note 1:
-------

mount [ -f ] [ -n Node ] [ -o Options ] [ -p ] [ -r ] [ -v VfsName ]
      [ -t Type | [ Device | Node:Directory ] Directory | all | -a ]
      [ -V [generic_options] special_mount_points ]

If you specify only the Directory parameter, the mount command takes it to be the
name of the directory or file on which
a file system, directory, or file is usually mounted (as defined in the
/etc/filesystems file). The mount command looks up
the associated device, directory, or file and mounts it. This is the most
convenient way of using the mount command,
because it does not require you to remember what is normally mounted on a
directory or file. You can also specify only
the device. In this case, the command obtains the mount point from the
/etc/filesystems file.

The /etc/filesystems file should include a stanza for each mountable file system,
directory, or file. This stanza should
specify at least the name of the file system and either the device on which it
resides or the directory name.
If the stanza includes a mount attribute, the mount command uses the associated
values. It recognizes five values
for the mount attributes: automatic, true, false, removable, and readonly.

The mount all command causes all file systems with the mount=true attribute to be
mounted in their normal places.
This command is typically used during system initialization, and the corresponding
mounts are referred to as
automatic mounts.

Example mount command on AIX:
-----------------------------

$ mount

node     mounted         mounted over    vfs    date         options
-------- --------------- --------------- ------ ------------ ---------------
/dev/hd4 / jfs2 Jun 06 17:15 rw,log=/dev/hd8
/dev/hd2 /usr jfs2 Jun 06 17:15 rw,log=/dev/hd8
/dev/hd9var /var jfs2 Jun 06 17:15 rw,log=/dev/hd8
/dev/hd3 /tmp jfs2 Jun 06 17:15 rw,log=/dev/hd8
/dev/hd1 /home jfs2 Jun 06 17:16 rw,log=/dev/hd8
/proc /proc procfs Jun 06 17:16 rw
/dev/hd10opt /opt jfs2 Jun 06 17:16 rw,log=/dev/hd8
/dev/fslv00 /XmRec jfs2 Jun 06 17:16 rw,log=/dev/hd8
/dev/fslv01 /tmp/m2 jfs2 Jun 06 17:16 rw,log=/dev/hd8
/dev/fslv02 /software jfs2 Jun 06 17:16 rw,log=/dev/hd8
/dev/oralv /opt/app/oracle jfs2 Jun 06 17:25 rw,log=/dev/hd8
/dev/db2lv /db2_database jfs2 Jun 06 19:54 rw,log=/dev/loglv00
/dev/fslv03 /bmc_home jfs2 Jun 07 12:11 rw,log=/dev/hd8
/dev/homepeter /home/peter jfs2 Jun 13 18:42 rw,log=/dev/hd8
/dev/bmclv /bcict/stage jfs2 Jun 15 15:21 rw,log=/dev/hd8
/dev/u01 /u01 jfs2 Jun 22 00:22 rw,log=/dev/loglv01
/dev/u02 /u02 jfs2 Jun 22 00:22 rw,log=/dev/loglv01
/dev/u05 /u05 jfs2 Jun 22 00:22 rw,log=/dev/loglv01
/dev/u03 /u03 jfs2 Jun 22 00:22 rw,log=/dev/loglv01
/dev/backuo /backup_ora jfs2 Jun 22 00:22 rw,log=/dev/loglv02
/dev/u02back /u02back jfs2 Jun 22 00:22 rw,log=/dev/loglv03
/dev/u01back /u01back jfs2 Jun 22 00:22 rw,log=/dev/loglv03
/dev/u05back /u05back jfs2 Jun 22 00:22 rw,log=/dev/loglv03
/dev/u04back /u04back jfs2 Jun 22 00:22 rw,log=/dev/loglv03
/dev/u03back /u03back jfs2 Jun 22 00:22 rw,log=/dev/loglv03
/dev/u04 /u04 jfs2 Jun 22 10:25 rw,log=/dev/loglv01

Example /etc/filesystems file:

/var:
dev = /dev/hd9var
vfs = jfs2
log = /dev/hd8
mount = automatic
check = false
type = bootfs
vol = /var
free = false

/tmp:
dev = /dev/hd3
vfs = jfs2
log = /dev/hd8
mount = automatic
check = false
vol = /tmp
free = false

/opt:
dev = /dev/hd10opt
vfs = jfs2
log = /dev/hd8
mount = true
check = true
vol = /opt
free = false
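
An NFS mount can be described in the same file. A sketch of such a stanza, reusing
the server and path from the mount examples above (attribute names as commonly used
for NFS entries; verify against your own /etc/filesystems), might be:

/mnt/nim:
        dev      = /data/nim
        vfs      = nfs
        nodename = n580l03
        mount    = false
        options  = ro,soft
        account  = false

After adding the stanza, "mount /mnt/nim" picks up these settings.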

Example of the relation of Logical Volumes and mountpoints:

/dev/lv01 = /u01
/dev/lv02 = /u02
/dev/lv03 = /u03
/dev/lv04 = /data
/dev/lv00 = /spl

2.2.5 mounting a CDROM:
=======================

AIX:
----
# mount -r -v cdrfs /dev/cd0 /cdrom

SuSE Linux:
-----------
# mount -t iso9660 /dev/cdrom /cdrom
# mount -t iso9660 /dev/cdrom /media/cdrom

Redhat Linux:
-------------
# mount -t iso9660 /dev/cdrom /media/cdrom

Solaris:
--------
# mount -r -F hsfs /dev/dsk/c0t6d0s2 /cdrom

HPUX:
-----

mount -F cdfs -o rr /dev/dsk/c1t2d0 /cdrom

Other commands on Linux:
------------------------

On some Linux distros, and with some scsi CDROM devices, you might try:

# mount /dev/sr0 /mount_point
# mount -t iso9660 /dev/sr0 /mount_point

2.2.6 Some other commands related to mounts:
============================================

fsstat command:
---------------

On some unixes, the fsstat command is available. It provides filesystem statistics.
It can take a lot of switches, so be sure to check the man pages.

On Solaris, the following example shows the statistics for each file operation for
"/" (using the -f option):

$ fsstat -f /
Mountpoint: /
operation #ops bytes
open 8.54K
close 9.8K
read 43.6K 65.9M
write 1.57K 2.99M
ioctl 2.06K
setfl 4
getattr 40.3K
setattr 38
access 9.19K
lookup 203K
create 595
remove 56
link 0
rename 9
mkdir 19
rmdir 0
readdir 2.02K 2.27M
symlink 4
readlink 8.31K
fsync 199
inactive 2.96K
fid 0
rwlock 47.2K
rwunlock 47.2K
seek 29.1K
cmp 42.9K
frlock 4.45K
space 8
realvp 3.25K
getpage 104K
putpage 2.69K
map 13.2K
addmap 34.4K
delmap 33.4K
poll 287
dump 0
pathconf 54
pageio 0
dumpctl 0
dispose 23.8K
getsecattr 697
setsecattr 0
shrlock 0
vnevent 0

fuser command:
--------------

AIX:

Purpose
Identifies processes using a file or file structure.

Syntax
fuser [ -c | -d | -f ] [ -k ] [ -u ] [ -x ] [ -V ]File ...

Description
The fuser command lists the process numbers of local processes that use the local
or remote files
specified by the File parameter. For block special devices, the command lists the
processes that use
any file on that device.

Flags

-c Reports on any open files in the file system containing File.
-d Implies the use of the -c and -x flags. Reports on any open files which have
   been unlinked from the file system (deleted from the parent directory). When
   used in conjunction with the -V flag, it also reports the inode number and
   size of the deleted file.
-f Reports on open instances of File only.
-k Sends the SIGKILL signal to each local process. Only the root user can kill a
   process of another user.
-u Provides the login name for local processes in parentheses after the process
   number.
-V Provides verbose output.
-x Used in conjunction with -c or -f, reports on executable and loadable objects
   in addition to the standard fuser output.

To list the process numbers of local processes using the /etc/passwd file, enter:
# fuser /etc/passwd

To list the process numbers and user login names of processes using the
/etc/filesystems file, enter:
# fuser -u /etc/filesystems

To terminate all of the processes using a given file system, enter:

# fuser -k -x -u /dev/hd1    -OR-
# fuser -kxuc /home

Either command lists the process number and user name, and then terminates each
process that is using
the /dev/hd1 (/home) file system. Only the root user can terminate processes that
belong to another user.
You might want to use this command if you are trying to unmount the /dev/hd1 file
system and a process
that is accessing the /dev/hd1 file system prevents this.

To list all processes that are using a file which has been deleted from a given
file system, enter:
# fuser -d /usr

Examples on Linux distros:

- To kill all processes accessing the file system /home in any way:
# fuser -km /home

- To invoke something only if no other process is using /dev/ttyS1:
if fuser -s /dev/ttyS1; then :; else something; fi

- To show all processes at the (local) TELNET port:
# fuser telnet/tcp

A similar command is the lsof command.
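
For comparison, a couple of lsof invocations covering similar ground (the mount
point and port used here are just examples):

# lsof /home          lists open files on the filesystem mounted at /home
# lsof -i TCP:23      lists processes using the telnet TCP port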

2.2.7 Starting and stopping NFS:
================================

Short note on stopping and starting NFS. See other sections for more detail.

On all unixes, a number of daemons should be running in order for NFS to be
functional, like for example the rpc.* processes, biod, nfsd and others.

Once nfs is running, in order to actually "share" or "export" a filesystem on your
server, so that remote clients are able to mount it, in most cases you should edit
the "/etc/exports" file. See other sections in this document (search on exportfs)
on how to accomplish this.

-- AIX:

The following subsystems are part of the nfs group: nfsd, biod, rpc.lockd,
rpc.statd, and rpc.mountd.
The nfs subsystem (group) is under control of the "resource controller", so
starting and stopping nfs
is actually easy

# startsrc -g nfs
# stopsrc -g nfs

Or use smitty.
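
To check the current state of the nfs group of subsystems, you can use:

# lssrc -g nfs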

-- Redhat Linux:
# /sbin/service nfs restart
# /sbin/service nfs start
# /sbin/service nfs stop

-- On some other Linux distros:
# /etc/init.d/nfs start
# /etc/init.d/nfs stop
# /etc/init.d/nfs restart

-- Solaris:
If the nfs daemons aren't running, then you will need to run:
# /etc/init.d/nfs.server start

-- HP-UX:
Issue the following command on the NFS server to start all the necessary NFS
processes (HP):
# /sbin/init.d/nfs.server start

Or if your machine is only a client:

# cd /sbin/init.d
# ./nfs.client start

===========================================
3. Change ownership file/dir, adding users:
===========================================

3.1 Changing ownership:
-----------------------

chown -R user[:group] file/dir (SVR4)
chown -R user[.group] file/dir (BSD)

(-R means recurse into subdirectories)

Examples:
chown -R oracle:oinstall /opt/u01
chown -R oracle:oinstall /opt/u02
chown -R oracle:oinstall /opt/u03
chown -R oracle:oinstall /opt/u04

-R means all subdirs also.

chown rjanssen file.txt - makes user rjanssen the owner of file.txt.

# groupadd dba
# useradd oracle
# mkdir /usr/oracle
# mkdir /usr/oracle/9.0
# chown -R oracle:dba /usr/oracle
# touch /etc/oratab
# chown oracle:dba /etc/oratab

Note: Not owner message:
------------------------

>>> Solaris:

It is possible to turn the chown command on or off (i.e., allow it to be used or
disallow its use) on a system by altering the /etc/system file. The /etc/system
file, along with the files in /etc/default, should be thought of as "system policy
files" -- files that allow the systems administrator to determine such things as
whether root can login over the network, whether su commands are logged, and
whether a regular user can change ownership of his own files.

On a system disallowing a user to change ownership of his files (this is now the
default), the value of rstchown is set to 1.
Think of this as saying "restrict chown is set to TRUE". You might see a line like
this in /etc/system (or no rstchown value at all):

set rstchown=1

On a system allowing chown by regular users, this value will be set to 0 as shown
here:

set rstchown=0

Whenever the /etc/system file is changed, the system will have to be rebooted for
the changes to take effect.
Since there is no daemon process associated with commands such a chown, there is
no process that one could send
a hangup (HUP) to effect the change in policy "on the fly".

Why might system administrators restrict access to the chown command? For a system
on which disk quotas are enforced,
they might not want to allow files to be "assigned" by one user to another user's
quota. More importantly,
for a system on which accountability is deemed important, system administrators
will want to know who
created each file on a system - whether to track down a potential system abuse or
simply to ask if a file that is
occupying space in a shared directory or in /tmp can be removed.

When a system disallows use of the chown command, you can expect to see dialog
like this:

% chown wallace myfile
chown: myfile: Not owner

Though it would be possible to disallow "chowning" of files by changing permissions
on /usr/bin/chown, such a change would not slow down most Unix users. They would
simply copy the /usr/bin/chown file to their own directory and make their copy
executable. Designed to be extensible, Unix will happily comply. Making the change
in the /etc/system file blocks any chown operation from taking effect, regardless
of where the executable is stored, who owns it, and what it is called. If usage of
chown is restricted in /etc/system, only the superuser can change ownership of files.

3.2 Add a user in Solaris:
--------------------------

Examples:

# useradd -u 3000 -g other -d /export/home/tempusr -m -s /bin/ksh -c "temporary user" tempusr
# useradd -u 1002 -g dba -d /export/home/avdsel -m -s /bin/ksh -c "Albert van der Sel" avdsel
# useradd -u 1001 -g oinstall -G dba -d /export/home/oraclown -m -s /bin/ksh -c "Oracle owner" oraclown
# useradd -u 1005 -g oinstall -G dba -d /export/home/brighta -m -s /bin/ksh -c "Bright Alley" brighta

useradd -u 300 -g staff -G staff -d /home/emc -m -s /usr/bin/ksh -c "EMC user" emc

A password cannot be specified using the useradd command.
Use passwd to give the user a password:

# passwd tempusr

UID must be unique and is typically a number between 100 and 60002
GID is a number between 0 and 60002

3.3 Add a user in AIX:
----------------------

You can also use the useradd command, just as in Solaris.
Or use the native "mkuser" command.

# mkuser albert

The mkuser command does not create password information for a user. It initializes
the password field
with an * (asterisk). Later, this field is set with the passwd or pwdadm command.
New accounts are disabled until the passwd or pwdadm commands are used to add
authentication
information to the /etc/security/passwd file.

You can use the Users application in Web-based System Manager to change user
characteristics. You could also
use the System Management Interface Tool (SMIT) "smit mkuser" fast path to run
this command.

The /usr/lib/security/mkuser.default file contains the default attributes for new
users.
This file is an ASCII file that contains user stanzas. These stanzas have
attribute default values
for users created by the mkuser command. Each attribute has the Attribute=Value
form. If an attribute
has a value of $USER, the mkuser command substitutes the name of the user. The end
of each attribute pair
and stanza is marked by a new-line character.

There are two stanzas, user and admin, that can contain all defined attributes
except the id and admin attributes.
The mkuser command generates a unique id attribute. The admin attribute depends on
whether the -a flag is used with
the mkuser command.

A typical user stanza looks like the following:

user:
pgroup = staff
groups = staff
shell = /usr/bin/ksh
home = /home/$USER
auth1 = SYSTEM

# mkuser [ -de | -sr ] [ -attr Attribute=Value [ Attribute=Value... ] ] Name
# mkuser [ -R load_module ] [ -a ] [ Attribute=Value ... ] Name

To create the davis user account with the default values in the
/usr/lib/security/mkuser.default file, type:
# mkuser davis

To create the davis account with davis as an administrator, type:

# mkuser -a davis

Only the root user or users with the UserAdmin authorization can create davis as
an administrative user.
To create the davis user account and set the su attribute to a value of false,
type:
# mkuser su=false davis

To create the davis user account that is identified and authenticated through the
LDAP load module, type:
# mkuser -R LDAP davis

To add davis to the groups finance and accounting, enter:

# chuser groups=finance,accounting davis

-- Add a user with the smit utility:
-- ---------------------------------
Start SMIT by entering

smit <Enter>

From the Main Menu, make the following selections:

-Security and Users
-Users
-Add a User to the System

The utility displays a form for adding new user information. Use the <Up-arrow>
and <Down-arrow> keys to move through
the form. Do not use <Enter> until you are finished and ready to exit the screen.
Fill in the appropriate fields of the Create User form (as listed in Create User
Form) and press <Enter>.
The utility exits the form and creates the new user.

-- Using SMIT to Create a Group:
-- -----------------------------
Use the following procedure to create a group.

Start SMIT by entering the following command:

smit <Enter>

The utility displays the Main Menu.

From the Main Menu, make the following selections:

-Security and Users
-Users
-Add a Group to the System

The utility displays a form for adding new group information.

Type the group name in the Group Name field and press <Enter>.
The group name must be eight characters or less.
The utility creates the new group, automatically assigns the next available GID,
and exits the form.

Primary Authentication method of system:
----------------------------------------
To check whether root has a primary authentication method of SYSTEM, use the
following command:
# lsuser -a auth1 root

If needed, change the value by using:

# chuser auth1=SYSTEM root

3.4 Add a user in HP-UX:
------------------------

-- Example 1:

Add user john to the system with all of the default attributes.

# useradd john

Add the user john to the system with a UID of 222 and a primary group
of staff.

# useradd -u 222 -g staff john

-- Example 2:

=> Add a user called guestuser as per the following requirements:
=> Primary group member of guests
=> Secondary group member of www and accounting
=> Shell must be /usr/bin/bash3
=> Home directory must be /home/guestuser

# useradd -g guests -G www,accounting -s /usr/bin/bash3 -d /home/guestuser -m guestuser
# passwd guestuser

3.5 Add a user in Linux Redhat:
-------------------------------

You can use tools like useradd or groupadd to create new users and groups from the
shell prompt.
But an easier way to manage users and groups is through the graphical application,
User Manager.

Users are described in the /etc/passwd file.
Groups are stored on Red Hat Linux in the /etc/group file.
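
A minimal command-line sketch (the group and user names are hypothetical):

# groupadd developers
# useradd -g developers -m -s /bin/bash -c "Jane Doe" jdoe
# passwd jdoe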

Or invoke the Gnome Linuxconf GUI Tool by typing "linuxconf". In Red Hat Linux,
linuxconf is found in the
/bin directory.

================================
4. Change filemode, permissions:
================================

Permissions are given to:

u = user
g = group
o = other/world
a = all

file/directory permissions (also called "filemodes") are:

r = read
w = write
x = execute

special modes are:

X = sets execute if already set (this one is particularly sexy, look below)
s = set setuid/setgid bit
t = set sticky bit

Examples:
---------

readable by all, everyone:
% chmod a+r essay.001

to remove read, write and execute permissions on the file biglist for the group
and others:
% chmod go-rwx biglist

make executable:
% chmod +x mycommand

set mode:
% chmod 644 filename

rwxrwxrwx=777
rw-rw-rw-=666
rw-r--r--=644 corresponds to umask 022
r-xr-xr-x=555
rwxrwxr-x=775

1 = execute
2 = write
4 = read

note that the total is 7

execute and read are: 1+4=5
read and write are: 2+4=6
read, write and exec: 1+2+4=7
and so on
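
A quick worked example of the umask relation mentioned above: with umask 022, a
new file created by touch gets mode 644 (rw-r--r--), and a new directory gets 755.

$ umask 022
$ touch newfile
$ mkdir newdir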

directories must always be executable...

so a file with, say, mode 640 means: the owner can read and write (4+2=6), the
group can read (4), and everyone else has no permission to use the file (0).

chmod -R a+X .
This command would set the executable bit (for all users) of all directories and
executables
below the current directory that presently have an execute bit set. Very helpful
when you want to set
all your binary files executable for everyone other than you without having to set
the executable bit
of all your conf files, for instance. *wink*

chmod -R g+w .
This command would set all the contents below the current directory writable by
your current group.

chmod -R go-rwx
This command would remove permissions for group and world users without changing
the bits for the file owner.
Now you don't have to worry that 'find . -type f -exec chmod 600 {}\;' will change
your binary files
non-executable. Further, you don't need to run an additional command to chmod your
directories.

chmod u+s /usr/bin/run_me_setuid
This command would set the setuid bit of the file. It's simply easier than
remembering which number to use when wanting to setuid/setgid, IMHO.

========================
5. About the sticky bit:
========================

- This info is valid for most Unix OS including Solaris and AIX:
----------------------------------------------------------------

A 't' or 'T' as the last character of the "ls -l" mode characters
indicates that the "sticky" (save text image) bit is set. See ls(1) for
an explanation of the distinction between 't' and 'T'.

The sticky bit has a different meaning, depending on the type of file it
is set on...

sticky bit on directories
-------------------------
[From chmod(2)]
If the mode bit S_ISVTX (sticky bit) is set on a directory, files
inside the directory may be renamed or removed only by the owner of
the file, the owner of the directory, or the superuser (even if the
modes of the directory would otherwise allow such an operation).

[Example]
drwxrwxrwt 104 bin bin 14336 Jun 7 00:59 /tmp

Only root is permitted to turn the sticky bit on or off. In addition the sticky
bit applies to anyone
who accesses the file. The syntax for setting the sticky bit on a dir /foo
directory is as follows:

chmod +t /foo

sticky bit on regular files
---------------------------
[From chmod(2)]
If an executable file is prepared for sharing, mode bit S_ISVTX prevents
the system from abandoning the swap-space image of the program-text
portion of the file when its last user terminates. Then, when the next
user of the file executes it, the text need not be read from the file
system but can simply be swapped in, thus saving time.

[From HP-UX Kernel Tuning and Performance Guide]
Local paging. When applications are located remotely, set the "sticky bit"
on the application binaries, using the chmod +t command. This tells the
system to page the text to the local disk. Otherwise, it is "retrieved"
across the network. Of course, this would only apply when there is actual
paging occurring. More recently, there is a kernel parameter,
page_text_to_local, which when set to 1, will tell the kernel to page all
NFS executable text pages to local swap space.

[Example]
-r-xr-xr-t 6 bin bin 24111111111664 Nov 14 2000 /usr/bin/vi

Solaris:
--------

The sticky bit on a directory is a permission bit that protects files within that
directory.
If the directory has the sticky bit set, only the owner of the file, the owner of
the directory,
or root can delete the file. The sticky bit prevents a user from deleting other
users' files from
public directories, such as uucppublic:

castle% ls -l /var/spool/uucppublic
drwxrwxrwt 2 uucp uucp 512 Sep 10 18:06 uucppublic
castle%

When you set up a public directory on a TMPFS temporary file system, make sure
that you set the sticky bit manually.

You can set sticky bit permissions by using the chmod command to assign the octal
value 1 as the first number
in a series of four octal values. Use the following steps to set the sticky bit on
a directory:

1. If you are not the owner of the file or directory, become superuser.
2. Type chmod <1nnn> <filename> and press Return.
3. Type ls -l <filename> and press Return to verify that the permissions of the
file have changed.
The following example sets the sticky bit permission on the pubdir directory:
castle% chmod 1777 pubdir
castle% ls -l pubdir
drwxrwxrwt 2 winsor staff 512 Jul 15 21:23 pubdir
castle%

================
6. About SETUID:
================

Each process has three user ID's:


the real user ID (ruid)
the effective user ID (euid) and
the saved user ID (suid)

The real user ID identifies the owner of the process, the effective uid is used in
most access control decisions, and the saved uid stores a previous user ID so that
it can be restored later.
Similarly, a process has three group IDs.

When a process is created by fork, it inherits the three uid's from the parent
process.
When a process executes a new file by exec..., it keeps its three uid's unless the
set-user-ID bit of the new file is set, in which case the effective uid and saved
uid
are assigned the user ID of the owner of the new file.
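
As a quick illustration (not part of the original text): while the passwd command
is running in another session, the POSIX ps format specifiers "user" (effective
user) and "ruser" (real user) show the difference; <pid_of_passwd> is a placeholder
for the actual process id:

# ps -o user= -o ruser= -p <pid_of_passwd>

The effective user will be shown as root (because /usr/bin/passwd is setuid root),
while the real user is your own login.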

When setuid (set-user identification) permission is set on an executable file, a
process that runs this file is granted access based on the owner of the file
(usually root), rather than the user who created the process. This permission
enables a user to access files and directories that are normally available only
to the owner.

The setuid permission is shown as an s in the file permissions.


For example, the setuid permission on the passwd command enables a user to change
passwords by assuming the permissions of the root ID. The permissions are the
following:

castle% ls -l /usr/bin/passwd
-r-sr-sr-x 3 root sys 96796 Jul 15 21:23 /usr/bin/passwd
castle%

You set setuid permissions by using the chmod command to assign the octal value 4
as the first number in a series of four octal values. Use the following steps to
set setuid permissions:

1. If you are not the owner of the file or directory, become superuser.
2. Type chmod <4nnn> <filename> and press Return.
3. Type ls -l <filename> and press Return to verify that the permissions of the
file have changed.

The following example sets setuid permission on the myprog file:

#chmod 4555 myprog


-r-sr-xr-x 1 winsor staff 12796 Jul 15 21:23 myprog
#

The setgid (set-group identification) permission is similar to setuid, except that
the effective group ID for the process is changed to the group owner of the file,
and a user is granted access based on permissions granted to that group.
The /usr/bin/mail program has setgid permissions:

castle% ls -l /usr/bin/mail
-r-x--s--x   1 bin      mail       64376 Jul 15 21:27 /usr/bin/mail
castle%

When setgid permission is applied to a directory, files subsequently created in
the directory belong to the group the directory belongs to, not to the group the
creating process belongs to. Any user who has write permission in the directory
can create a file there; however, the file does not belong to the group of the
user, but instead belongs to the group of the directory.
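
A minimal illustration of this behaviour (directory, group, and file names are
just examples):

# chmod g+s /projects/shared
# touch /projects/shared/newfile
# ls -ld /projects/shared
# ls -l /projects/shared/newfile

The directory mode now shows an 's' in the group execute position (for example
drwxrwsr-x), and newfile belongs to the group of /projects/shared rather than
to the primary group of the user who created it.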

You can set setgid permissions by using the chmod command to assign the octal
value 2 as the first number
in a series of four octal values. Use the following steps to set setgid
permissions:

1. If you are not the owner of the file or directory, become superuser.
2. Type chmod <2nnn> <filename> and press Return.
3. Type ls -l <filename> and press Return to verify that the permissions of the
file have changed.
The following example sets setgid permission on the myprog2 file:

#chmod 2551 myprog2


#ls -l myprog2
-r-xr-s--x   1 winsor   staff      26876 Jul 15 21:23 myprog2
#

=========================
7. Find command examples:
=========================

Introduction
The find command allows the Unix user to process a set of files and/or directories
in a file subtree.

You can specify the following:

where to search (pathname)
what type of file to search for (-type: directories, data files, links)
how to process the files (-exec: run a process against a selected file)
the name of the file(s) (-name)
perform logical operations on selections (-o and -a)
search for a file with a specific name in a set of files (-name)

EXAMPLES
--------

# find . -name "rc.conf" -print

This command will search in the current directory and all sub directories for a
file named rc.conf.
Note: The -print option will print out the path of any file that is found with
that name. In general -print will print out the path of any file that meets the
find criteria.

# find . -name "rc.conf" -exec chmod o+r '{}' \;

This command will search in the current directory and all sub directories. All
files named rc.conf will be processed
by the chmod -o+r command. The argument '{}' inserts each found file into the
chmod command line.
The \; argument indicates the exec command line has ended.
The end result of this command is that all rc.conf files have the "other" permissions
set to read access (if the operator is the owner of the file).

# find . -exec grep "www.athabasca" '{}' \; -print

This command will search in the current directory and all sub directories.
All files that contain the string will have their path printed to standard output.

# find / -xdev -size +2048 -ls | sort -r +6

This command will find all files on the root filesystem (without crossing mount
points) larger than 1 MB, and list them sorted by size.

# find . -exec grep "CI_ADJ_TYPE" {} \; -print

This command search all subdirs all files to find text CI_ADJ_TYPE

Other examples:
---------------
# find . -name file -print
# find / -name $1 -exec ls -l {} \;

# find / -user nep -exec ls -l {} \; >nepfiles.txt


In English: search from the root directory for any files owned by nep
and execute an ls -l on the file when any are found.
Capture all output in nepfiles.txt.

# find $HOME -name \*.txt -print


In order to protect the asterisk from being expanded by the shell,
it is necessary to use a backslash to escape the asterisk as in:

# find / -atime +30 -print


This prints files that have not been accessed in the last 30 days

# find / -atime +100 -size +500000c -print


The find search criteria can be combined. This command will locate and list all
files
that were last accessed more than 100 days ago, and whose size exceeds 500,000
bytes.
# find /opt/bene/process/logs -name 'ALBRACHT*' -mtime +90 -exec rm {} \;

# find /example /new/example -exec grep -l 'Where are you' {} \;


# find / \( -name a.out -o -name '*.o' \) -atime +7 -exec rm {} \;
# find . -name '*.trc' -mtime +3 -exec rm {} \;
# find / -fsonly hfs -print
# cd /; find . ! -path ./Disk -print | cpio -pdxm /Disk
# cd /; find . -path ./Disk -prune -o -print | cpio -pdxm /Disk
# cd /; find . -xdev -print | cpio -pdm /Disk
# find -type f -print | xargs chmod 444
# find -type d -print | xargs chmod 555
# find . -atime +1 -name '*' -exec rm -f {} \;
# find /tmp -atime +1 -name '*' -exec rm -f {} \;
# find /usr/tmp -atime +1 -name '*' -exec rm -f {} \;
# find / -name core -exec rm -f {} \;
# find . -name "*.dbf" -mtime -2 -exec ls {} \;

* Search and list all files from current directory and down for the string ABC:
find ./ -name "*" -exec grep -H ABC {} \;
find ./ -type f -print | xargs grep -H "ABC" /dev/null
egrep -r ABC *
* Find all files of a given type from current directory on down:
find ./ -name "*.conf" -print
* Find all user files larger than 5Mb:
find /home -size +5000000c -print
* Find all files owned by a user (defined by user id number. see /etc/passwd) on
the system: (could take a very long time)
find / -user 501 -print
* Find all files created or updated in the last five minutes: (Great for finding
effects of make install)
find / -cmin -5
* Find all users in group 20 and change them to group 102: (execute as root)
find / -group 20 -exec chown :102 {} \;
* Find all suid and setgid executables:
find / \( -perm -4000 -o -perm -2000 \) -type f -exec ls -ldb {} \;
find / -type f -perm +6000 -ls

Example:
--------

cd /database/oradata/pegacc/archive
archdir=`pwd`
if [ "$archdir" = "/database/oradata/pegacc/archive" ]
then
  find . -name "*.dbf" -mtime +5 -exec rm {} \;
else
  echo "error in maintenance of PEGACC archives" >> /opt/app/oracle/admin/log/archmaint.log
fi

Example:
--------

The following example shows how to find files larger than 400 blocks in the
current directory:

# find . -size +400 -print

REAL COOL EXAMPLE:
------------------

This example could even help in recovery of a file:

In some rare cases a strangely-named file will show itself in your directory and
appear to be un-removable with the rm command. Here is where the use of ls -li
and find with its -inum [inode] primary does the job.
Let's say that ls -l shows your irremovable file as

-rw------- 1 smith smith 0 Feb 1 09:22 ?*?*P

Type:

ls -li

to get the index node, or inode.

153805 -rw------- 1 smith smith 0 Feb 1 09:22 ?*?^P

The inode for this file is 153805. Use find -inum [inode] to make sure that the
file is correctly identified.

% find -inum 153805 -print


./?*?*P

Here, we see that it is. Then use the -exec functionality to do the remove:

% find . -inum 153805 -print -exec /bin/rm {} \;

Note that if this strangely named file were not of zero-length, it might contain
accidentally misplaced
and wanted data. Then you might want to determine what kind of data the file
contains and move the file
to some temporary directory for further investigation, for example:

% find . -inum 153805 -print -exec /bin/mv {} unknown.file \;

Will rename the file to unknown.file, so you can easily inspect it.

Note: difference between mtime and atime:
------------------------------------------

In using the find command where you want to delete files older than a certain
date, you can use
commands like
find . -name "*.log" -mtime +30 -exec rm {} \; or
find . -name "*.dbf" -atime +30 -exec rm {} \;
Why should you choose, or not choose, between atime and mtime?

It is important to distinguish between a file or directory's change time (ctime),
access time (atime), and modify time (mtime).

ctime -- In UNIX, it is not possible to tell the actual creation time of a file.
The ctime--change time--
is the time when changes were made to the file or directory's inode
(owner, permissions, etc.).
The ctime is also updated when the contents of a file change. It is
needed by the dump command
to determine if the file needs to be backed up. You can view the ctime
with the ls -lc command.

atime -- The atime--access time--is the time when the data of a file was last
accessed. Displaying the contents
of a file or executing a shell script will update a file's atime, for
example.

mtime -- The mtime--modify time--is the time when the actual contents of a file
were last modified. This is the time displayed in a long directory listing (ls -l).

That's why backup utilities use the mtime when performing incremental backups:
When the utility reads the data for a file that is to be included in a backup, it
does not
affect the file's modification time, but it does affect the file's access time.

So for most practical reasons, if you want to delete logfiles (or other files)
older than a certain
date, its best to use the mtime attribute.

How to make those times visible?

"ls -l"  shows mtime
"ls -lu" shows atime
"ls -lc" shows ctime

"istat filename" will show all three.

pago-am1:/usr/local/bb>istat bb18b3.tar.gz
Inode 20 on device 10/9 File
Protection: rw-r--r--
Owner: 100(bb) Group: 100(bb)
Link count: 1 Length 427247 bytes

Last updated:   Tue Aug 14 11:01:46 2001
Last modified:  Thu Jun 21 07:36:32 2001
Last accessed:  Thu Nov 01 20:38:46 2001

===================
7. Crontab command:
===================

Cron is used to schedule or run periodically all sorts of executable programs or
shell scripts, like backup runs, housekeeping jobs etc.
The cron daemon makes it all happen.

Who has access to cron is, on most Unixes, determined by the "cron.allow" and
"cron.deny" files.
Every allowed user can have its own "crontab" file.
The crontab of root is typically used for system administrative jobs.

On most unixes the relevant files can be found in:


/var/spool/cron/crontabs or
/var/adm/cron or
/etc/cron.d

For example, on Solaris the /var/adm/cron/cron.allow and /var/adm/cron/cron.deny
files control which users can use the crontab command.
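
For example, to allow the oracle user to use crontab on Solaris (the same idea
applies to AIX; only the file location may differ):

# echo "oracle" >> /var/adm/cron/cron.allow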

Most common usage:

- if you just want a listing: crontab -l


- if you want to edit and change: crontab -e

crontab [ -e | -l | -r | -v | File ]

-e edit and submit, -r remove, -l list

A crontab file contains entries for each cron job. Entries are separated by
newline characters.
Each crontab file entry contains six fields separated by spaces or tabs in the
following form:

minute hour day_of_month month weekday command

0 0 * 8 * /u/harry/bin/maintenance

Notes:
------

Note 1: start and stop cron:
----------------------------

-- Solaris and some other unixes:

The proper way to stop and restart cron are:

# /etc/init.d/cron stop
# /etc/init.d/cron start

In Solaris 10 you could use the following command as well:


# svcadm refresh cron
# svcadm restart cron

-- Other way to restart cron:

In most Unixes, cron is started by init and there is a record in the /etc/inittab
file which makes that happen. Check if your system indeed has a record for cron in
the inittab file.
The type of start should be "respawn", which means that should the
superuser do a "kill -9" on the cron daemon, it is simply restarted again.
Again, preferably, there should be a stop and start script to restart cron.

Especially on AIX, there is no neat way to restart cron: neither via the System
Resource Controller (startsrc command) nor via a script is a standard method
available. Just kill cron and it will be respawned.

-- On many linux distros:

to restart the cron daemon, you could do either a "service crond restart" or a
"service
crond reload".

Note 2:
-------

Create a cronjobs file


You can do this on your local computer in Notepad or you can create the file
directly on
your Virtual Server using your favorite UNIX text editor (pico, vi, etc).
Your file should contain the following entries:

MAILTO="[email protected]"
0 1 1 1-12/3 * /usr/local/bin/vnukelog

This will run the command "/usr/local/bin/vnukelog" (which clears all of your log
files) at
1 AM on the first day of the first month of every quarter, or January, April,
July, and October (1-12/3).
Obviously, you will need to substitute a valid e-mail address in the place of
"[email protected]".

If you have created this file on your local computer,
FTP the file up to your Virtual Server and store it in your home directory under
the name "cronjobs" (you can actually use any name you would like).

Register your cronjobs file with the system


After you have created your cronjobs file (and have uploaded it to your Virtual
Server if applicable),
you need to Telnet to your server and register the file with the cron system
daemon. To do this, simply type:
crontab cronjobs

Or if you used a name other than "cronjobs", substitute the name you selected for
the occurrence of "cronjobs" above.
Note 3:
-------
# use /bin/sh to run commands, no matter what /etc/passwd says
SHELL=/bin/sh
# mail any output to `paul', no matter whose crontab this is
MAILTO=paul
#
# run at five minutes past every hour from 06:00 to 18:00, every day
5 6-18 * * * /opt/app/oracle/admin/scripts/grepora.sh
# run at 2:15pm on the first of every month -- output mailed to paul
15 14 1 * * $HOME/bin/monthly
# run at 10 pm on weekdays, annoy Joe
0 22 * * 1-5 mail -s "It's 10pm" joe%Joe,%%Where are your kids?%
23 0-23/2 * * * echo "run 23 minutes after midn, 2am, 4am ..., everyday"
5 4 * * sun echo "run at 5 after 4 every sunday"

2>&1 means:

It means that standard error is redirected to the same destination as standard
output. Standard error could also be redirected to a different file, like this:

ls > toto.txt 2> error.txt

If your shell is csh or tcsh, you would redirect standard output and standard
error together like this:

ls >& toto.txt

Csh and tcsh cannot redirect standard error separately.
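
In a crontab entry this construction is typically used to capture everything a
job writes; the script and log paths below are only an illustration:

0 2 * * * /opt/backupscripts/backup_to_rmt1.sh > /tmp/backup_rmt1.log 2>&1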

Note 4:
-------

thread

Q:

> Isn't there a way to refresh cron to pick up changes made using
> crontab -e? I made the changes but the specified jobs did not run.
> I'm thinking I need to refresh cron to pick up the changes. Is this
> true? Thanks.

A:

Crontab -e should do that for you, that's the whole point of using
it rather than editing the file yourself.
Why do you think the job didn't run?
Post the crontab entry and the script. Give details of the version of
Tru64 and the patch level.
Then perhaps we can help you to figure out the real cause of the problem.
Hope this helps

A:

I have seen the following problem when editing the cron file for another
user:

crontab -e idxxxxxx

This changed the control file;
when I verified with crontab -l, the contents were correctly shown,
but the cron daemon did not execute the new contents.

To solve the problem, I needed to follow the following commands:

su - idxxxxxx
crontab -l |crontab

This seems to work ... since then I prefer the following

su - idxxxxxx
crontab -e

which seems to work also ...

Note 5:
-------

On AIX it has been observed that if the "daemon=" attribute of a user is set to
false, this user cannot use crontab, even if the account is placed in cron.allow.

You need to set the attribute to "daemon=true".

* daemon Defines whether the user can execute programs using the system
* resource controller (SRC). Possible values: true or false.

Note 6:
-------

If you want to quick test the crontab of a user:

su - user
and put the following in the crontab of that user:

* * * * * date >/tmp/elog

===========================
8. Job control, background:
===========================

To put a sort job (or other job) in background:


# sort < foo > bar &

To show jobs:
# jobs

To show processes:
# ps
# ps -ef | grep ora

Job in foreground -> background:
Ctrl-Z (suspend), then
# bg    or    bg %jobid

Job in background -> foreground:
# fg %jobid
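
A short illustrative session (job and process numbers will of course differ on
your system):

# sort < foo > bar &
[1]     4242
# jobs
[1] +  Running                 sort < foo > bar &
# fg %1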

Stop a process:
# kill -9 3535 (3535 is the pid, process id)

Stop a background process you may try this:


# kill -QUIT 3421

-- Kill all processes of a specific user:
-- ---------------------------------------

To kill all processes of a specific user, enter:


# ps -u [user-id] -o pid | grep -v PID | xargs kill -9

Another way:
Use who to check out your current users and their terminals. Kill all processes
related to a specific terminal:
# fuser -k /dev/pts[#]

Yet another method:
Su to the user-id you wish to kill all processes of and enter:
# su - [user-id] -c "kill -9 -1"

Or su - to that userid and use the killall command, which is available on most
Unixes, for example AIX. (Note that killall behaves differently per Unix: on AIX it
kills all of the user's processes, while on Linux it kills processes by name.)
# killall

The nohup command:
------------------

When working with the UNIX operating system, there will be times when you will
want to run commands that are immune
to log outs or unplanned login session terminations. This is especially true for
UNIX system administrators.
The UNIX command for handling this job is the nohup (no hangup) command.

Normally when you log out, or your session terminates unexpectedly, the system
will kill all processes you have started.
Starting a command with nohup counters this by arranging for all stopped, running,
and background jobs to ignore
the SIGHUP signal.

The syntax for nohup is:


nohup command [arguments]

You may optionally add an ampersand to the end of the command line to run the job
in the background:
nohup command [arguments] &

If you do not redirect output from a process kicked off with nohup, both standard
output (stdout) and
standard error (stderr) are sent to a file named nohup.out. This file will be
created in $HOME (your home directory)
if it cannot be created in the working directory. Real-time monitoring of what is
being written to nohup.out
can be accomplished with the "tail -f nohup.out" command.
Although the nohup command is extremely valuable to UNIX system administrators, it
is also a must-know tool for others who run lengthy or critical processes on UNIX
systems.

The nohup command runs the command specified by the Command parameter and any
related Arg parameters,
ignoring all hangup (SIGHUP) signals. Use the nohup command to run programs in the
background after logging off.
To run a nohup command in the background, add an & (ampersand) to the end of the
command.

Whether or not the nohup command output is redirected to a terminal, the output is
appended to the nohup.out file
in the current directory. If the nohup.out file is not writable in the current
directory, the output is redirected
to the $HOME/nohup.out file. If neither file can be created nor opened for
appending, the command specified
by the Command parameter is not invoked. If the standard error is a terminal, all
output written by the
named command to its standard error is redirected to the same file descriptor as
the standard output.

To run a command in the background after you log off, enter:


$ nohup find / -print &

After you enter this command, the following is displayed:


670
$ Sending output to nohup.out
The process ID number changes to that of the background process started by &
(ampersand). The message Sending
output to nohup.out informs you that the output from the find / -print command is
in the nohup.out file.
You can log off after you see these messages, even if the find command is still
running.

Example of ps -ef on an AIX5 system:

[LP 1]root@ol16u209:ps -ef


UID PID PPID C STIME TTY TIME CMD
root 1 0 0 Oct 17 - 0:00 /etc/init
root 4198 1 0 Oct 17 - 0:00 /usr/lib/errdemon
root 5808 1 0 Oct 17 - 1:15 /usr/sbin/syncd 60
oracle 6880 1 0 10:27:26 - 0:00 ora_lgwr_SPLDEV1
root 6966 1 0 Oct 17 - 0:00 /usr/ccs/bin/shlap
root 7942 43364 0 Oct 17 - 0:00 sendmail: accepting connections
alberts 9036 9864 0 20:41:49 - 0:00 sshd: alberts@pts/0
root 9864 44426 0 20:40:21 - 0:00 sshd: alberts [priv]
root 27272 36280 1 20:48:03 pts/0 0:00 ps -ef
oracle 27856 1 0 10:27:26 - 0:01 ora_smon_SPLDEV1
oracle 31738 1 0 10:27:26 - 0:00 ora_dbw0_SPLDEV1
oracle 31756 1 0 10:27:26 - 0:00 ora_reco_SPLDEV1
alberts 32542 9036 0 20:41:49 pts/0 0:00 -ksh
maestro 33480 34394 0 05:59:45 - 0:00 /prj/maestro/maestro/bin/batchman
-parm 32000
root 34232 33480 0 05:59:45 - 0:00 /prj/maestro/maestro/bin/jobman
maestro 34394 45436 0 05:59:45 - 0:00 /prj/maestro/maestro/bin/mailman
-parm 32000 -- 2002 OL16U209 CONMAN UNIX 6.
root 34708 1 0 13:55:51 lft0 0:00 /usr/sbin/getty /dev/console
oracle 35364 1 0 10:27:26 - 0:01 ora_cjq0_SPLDEV1
oracle 35660 1 0 10:27:26 - 0:04 ora_pmon_SPLDEV1
root 36280 32542 0 20:45:06 pts/0 0:00 -ksh
root 36382 43364 0 Oct 17 - 0:00 /usr/sbin/rsct/bin/IBM.ServiceRMd
root 36642 43364 0 Oct 17 - 0:00 /usr/sbin/rsct/bin/IBM.CSMAgentRMd
root 36912 43364 0 Oct 17 - 0:03 /usr/opt/ifor/bin/i4lmd -l
/var/ifor/logdb -n clwts
root 37186 43364 0 Oct 17 - 0:00 /etc/ncs/llbd
root 37434 43364 0 Oct 17 - 0:17 /usr/opt/ifor/bin/i4llmd -b -n
wcclwts -l /var/ifor/llmlg
root 37738 37434 0 Oct 17 - 0:00 /usr/opt/ifor/bin/i4llmd -b -n
wcclwts -l /var/ifor/llmlg
root 37946 1 0 Oct 17 - 0:00 /opt/hitachi/HNTRLib2/bin/hntr2mon
-d
oracle 38194 1 0 Oct 17 - 0:00
/prj/oracle/product/9.2.0.3/bin/tnslsnr LISTENER -inherit
root 38468 43364 0 Oct 17 - 0:00 /usr/sbin/rsct/bin/IBM.AuditRMd
root 38716 1 0 Oct 17 - 0:00 /usr/bin/itesmdem itesrv.ini
/etc/IMNSearch/search/
imnadm 39220 1 0 Oct 17 - 0:00 /usr/IMNSearch/httpdlite/httpdlite
-r /etc/IMNSearch/httpdlite/httpdlite.con
root 39504 36912 0 Oct 17 - 0:00 /usr/opt/ifor/bin/i4lmd -l
/var/ifor/logdb -n clwts
root 39738 43364 0 Oct 17 - 0:01 /usr/DynamicLinkManager/bin/dlmmgr
root 40512 43364 0 Oct 17 - 0:01 /usr/sbin/rsct/bin/rmcd -r
root 40784 43364 0 Oct 17 - 0:00 /usr/sbin/rsct/bin/IBM.ERrmd
root 41062 1 0 Oct 17 - 0:00 /usr/sbin/cron
was 41306 1 0 Oct 17 - 2:10 /prj/was/java/bin/java -Xmx256m
-Dwas.status.socket=32776 -Xms50m -Xbootclas
oracle 42400 1 0 10:27:26 - 0:02 ora_ckpt_SPLDEV1
root 42838 1 0 Oct 17 - 0:00 /usr/sbin/uprintfd
root 43226 43364 0 Oct 17 - 0:00 /usr/sbin/nfsd 3891
root 43364 1 0 Oct 17 - 0:00 /usr/sbin/srcmstr
root 43920 43364 0 Oct 17 - 0:00 /usr/sbin/aixmibd
root 44426 43364 0 Oct 17 - 0:00 /usr/sbin/sshd -D
root 44668 43364 0 Oct 17 - 0:00 /usr/sbin/portmap
root 44942 43364 0 Oct 17 - 0:00 /usr/sbin/snmpd
root 45176 43364 0 Oct 17 - 0:00 /usr/sbin/snmpmibd
maestro 45436 1 0 Oct 17 - 0:00 /prj/maestro/maestro/bin/netman
root 45722 43364 0 Oct 17 - 0:00 /usr/sbin/inetd
root 45940 43364 0 Oct 17 - 0:00 /usr/sbin/muxatmd
root 46472 43364 0 Oct 17 - 0:00 /usr/sbin/hostmibd
root 46780 43364 0 Oct 17 - 0:00 /etc/ncs/glbd
root 46980 43364 0 Oct 17 - 0:00 /usr/sbin/qdaemon
root 47294 1 0 Oct 17 - 0:00 /usr/local/sbin/syslog-ng -f
/usr/local/etc/syslog-ng.conf
root 47484 43364 0 Oct 17 - 0:00 /usr/sbin/rpc.lockd
daemon 48014 43364 0 Oct 17 - 0:00 /usr/sbin/rpc.statd
root 48256 43364 0 Oct 17 - 0:00 /usr/sbin/rpc.mountd
root 48774 43364 0 Oct 17 - 0:00 /usr/sbin/biod 6
root 49058 43364 0 Oct 17 - 0:00 /usr/sbin/writesrv
[LP 1]root@ol16u209:

Another example of ps -ef on an AIX5 system:


# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 Jan 23 - 0:33 /etc/init
root 69706 1 0 Jan 23 - 0:00 /usr/lib/errdemon
root 81940 1 0 Jan 23 - 0:00 /usr/sbin/srcmstr
root 86120 1 2 Jan 23 - 236:39 /usr/sbin/syncd 60
root 98414 1 0 Jan 23 - 0:00 /usr/ccs/bin/shlap64
root 114802 81940 0 Jan 23 - 0:32
/usr/sbin/rsct/bin/IBM.CSMAgentRMd
root 135366 81940 0 Jan 23 - 0:00 /usr/sbin/sshd -D
root 139446 81940 0 Jan 23 - 0:07 /usr/sbin/rsct/bin/rmcd -r
root 143438 1 0 Jan 23 - 0:00 /usr/sbin/uprintfd
root 147694 1 0 Jan 23 - 0:26 /usr/sbin/cron
root 155736 1 0 Jan 23 - 0:00 /usr/local/sbin/syslog-ng -f
/usr/local/etc/syslog-ng.conf
root 163996 81940 0 Jan 23 - 0:00 /usr/sbin/rsct/bin/IBM.ERrmd
root 180226 81940 0 Jan 23 - 0:00
/usr/sbin/rsct/bin/IBM.ServiceRMd
root 184406 81940 0 Jan 23 - 0:00 /usr/sbin/qdaemon
root 200806 1 0 Jan 23 - 0:08
/opt/hitachi/HNTRLib2/bin/hntr2mon -d
root 204906 81940 0 Jan 23 - 0:00 /usr/sbin/rsct/bin/IBM.AuditRMd
root 217200 1 0 Jan 23 - 0:00 ./mflm_manager
root 221298 81940 0 Jan 23 - 1:41
/usr/DynamicLinkManager/bin/dlmmgr
root 614618 1 0 Apr 03 lft0 0:00 -ksh
reserve 1364024 1548410 0 07:10:10 pts/0 0:00 -ksh
root 1405140 1626318 1 08:01:38 pts/0 0:00 ps -ef
root 1511556 614618 2 07:45:52 lft0 0:41 tar -cf /dev/rmt1.1 /spl
reserve 1548410 1613896 0 07:10:10 - 0:00 sshd: reserve@pts/0
root 1613896 135366 0 07:10:01 - 0:00 sshd: reserve [priv]
root 1626318 1364024 1 07:19:13 pts/0 0:00 -ksh

Some more examples:

# nohup somecommand > preferred-name & sleep 1; tail -f preferred-name

# nohup make bzImage &


# tail -f nohup.out

# nohup make modules 1> modules.out 2> modules.err &


# tail -f modules.out

==========================================
9. Backup commands, TAR, and Zipped files:
==========================================

For SOLARIS as well as AIX, the following commands can be used:


tar, cpio, dd, gzip/gunzip, compress/uncompress, backup and restore

9.1 tar: Short for "Tape Archiver":
===================================
Some examples should explain the usage of "tar" to create backups, or to create
easy to transport .tar files.

Create a backup to tape device 0hc of file sys01.dbf


# tar -cvf /dev/rmt/0hc /u01/oradata/sys01.dbf
# tar -rvf /dev/rmt/0hc /u02/oradata/data_01.dbf

-c create
-r append
-x extract
-v verbose
-t list

Extract the contents of example.tar and display the files as they are extracted.
# tar -xvf example.tar

Create a tar file named backup.tar from the contents of the directory
/home/ftp/pub
# tar -cf backup.tar /home/ftp/pub

list contents of example.tar to the screen


# tar -tvf example.tar

to restore the file /home/bcalkins/.profile from the archive:


- First we do a backup:
# tar -cvf /dev/rmt/0 /home/bcalkins
- And later we do a restore:
# tar -xvf /dev/rmt/0 /home/bcalkins/.profile

If you use an absolute path, you can only restore to that same absolute location.
If you use a relative path, you can restore into any directory.
In that case, use tar with a relative pathname; for example, if you want to back up
/home/bcalkins, change to that directory and use

# tar -cvf backup_oracle_201105.tar ./*
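
Such a relative archive can later be extracted into any directory; the target
directory below is just an example, assuming the tar file was moved to /backups:

# mkdir /tmp/restore_test
# cd /tmp/restore_test
# tar -xvf /backups/backup_oracle_201105.tar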

To extract the directory conv:

# tar -xvf /dev/rmt0 /u02/oradata/conv

Example:
--------

mt -f /dev/rmt1 rewind
mt -f /dev/rmt1.1 fsf 6
tar -xvf /dev/rmt1.1 /data/download/expdemo.zip

Most common error messages with tar:
-------------------------------------

-- 0511-169: A directory checksum error on media: MediaName not equal to Number

Possible Causes
From the command line, you issued the tar command to extract files from an archive
that was not created
with the tar command.

-- 0511-193: An error occurred while reading from the media

Possible Causes
You issued the tar command to read an archive from a tape device that has a
different block size
than when the archive was created.

Solution:

# chdev -l rmt0 -a block_size=0

-- File too large:

Possible Causes
This usually means that the file being written or extracted exceeds the file size
limit (fsize) in the ulimit settings of the user running tar, or that the target
filesystem was not created with large file support. Raising the fsize limit or
using a large-file enabled filesystem typically solves this.

Extra note on the tar command on AIX:
-------------------------------------

If you need to backup multiple large mountpoints to a large tape, you might think
you
can use something like:

tar -cvf /dev/rmt1 /spl


tar -rvf /dev/rmt1 /prj
tar -rvf /dev/rmt1 /opt
tar -rvf /dev/rmt1 /usr
tar -rvf /dev/rmt1 /data
tar -rvf /dev/rmt1 /backups
tar -rvf /dev/rmt1 /u01/oradata
tar -rvf /dev/rmt1 /u02/oradata
tar -rvf /dev/rmt1 /u03/oradata
tar -rvf /dev/rmt1 /u04/oradata
tar -rvf /dev/rmt1 /u05/oradata

Actually on AIX this is not OK. The tape will rewind after each tar command, so
effectively you will end up with ONLY the last backup on the tape.

You should use the non-rewinding class instead, like for example:

tar -cf /dev/rmt1.1 /spl


tar -cf /dev/rmt1.1 /apps
tar -cf /dev/rmt1.1 /prj
tar -cf /dev/rmt1.1 /software
tar -cf /dev/rmt1.1 /opt
tar -cf /dev/rmt1.1 /usr
tar -cf /dev/rmt1.1 /data
tar -cf /dev/rmt1.1 /backups
#tar -cf /dev/rmt1.1 /u01/oradata
#tar -cf /dev/rmt1.1 /u02/oradata
#tar -cf /dev/rmt1.1 /u03/oradata
#tar -cf /dev/rmt1.1 /u04/oradata
#tar -cf /dev/rmt1.1 /u05/oradata
Use this table to decide on which class to use:

The following table shows the names of the rmt special files and their
characteristics.

Special File Rewind on Close Retension on Open Density Setting


/dev/rmt* Yes No #1
/dev/rmt*.1 No No #1
/dev/rmt*.2 Yes Yes #1
/dev/rmt*.3 No Yes #1
/dev/rmt*.4 Yes No #2
/dev/rmt*.5 No No #2
/dev/rmt*.6 Yes Yes #2
/dev/rmt*.7 No Yes #2

To restore an item from a logical tape, use commands as in the following example:

mt -f /dev/rmt1 rewind
mt -f /dev/rmt1.1 fsf 2   in order to put the pointer to the beginning of tape file 3.

mt -f /dev/rmt1.1 fsf 7   in order to put the pointer to the beginning of tape file 8.

Now you can use a command like for example:

tar -xvf /dev/rmt1.1 /backups/oradb/sqlnet.log

Another example:

mt -f /dev/rmt1 rewind
mt -f /dev/rmt1.1 fsf 8
tar -xvf /dev/rmt1.1 /u01/oradata/spltrain/temp01.dbf

Example Backupscript on AIX:
----------------------------

#!/usr/bin/ksh

# BACKUP-SCRIPT SPL SERVER PSERIES 550


# THIS IS THE PRIMARY BACKUP, TO THE TAPE ROBOT RMT1.
# NOTE: IN ADDITION TO THIS BACKUP, THERE IS ALSO A BACKUP OF THE
# /backup DISK TO THE INTERNAL TAPE DRIVE RMT0.

# BECAUSE WE DO NOT YET FULLY KNOW WHETHER THE APPLICATIONS MUST BE STOPPED
# BEFORE THE BACKUP, THIS SCRIPT IS STILL UNDER REVISION.

# VERSION: 0.1
# DATE   : 27-12-2005
# PURPOSE OF THE SCRIPT:
# - STOP THE APPLICATIONS
# - THEN BACK UP TO TAPE
# - START THE APPLICATIONS
# CHECK BEFOREHAND THAT THE TAPE LIBRARY IS LOADED, VIA "/opt/backupscripts/load_lib.sh"

BACKUPLOG=/opt/backupscripts/backup_to_rmt1.log
export BACKUPLOG

DAYNAME=`date +%a`;export DAYNAME


DAYNO=`date +%d`;export DAYNO

########################################
# 1. LOG THE START TIME IN A LOG FILE  #
########################################

echo "-----------------" >> ${BACKUPLOG}


echo "Start Backup 550:" >> ${BACKUPLOG}
date >> ${BACKUPLOG}

########################################
# 2. STOP THE APPLICATIONS             #
########################################

# STOP ALL ORACLE DATABASES


su - oracle -c "/opt/backupscripts/stop_oracle.sh"
sleep 30

# STOP WEBSPHERE


cd /prj/was/bin
./stopServer.sh server1 -username admin01 -password vga88nt
sleep 30

#SHUTDOWN ETM instances:


su - cissys -c '/spl/SPLDEV1/bin/splenviron.sh -e SPLDEV1 -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLDEV2/bin/splenviron.sh -e SPLDEV2 -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLCONF/bin/splenviron.sh -e SPLCONF -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLPLAY/bin/splenviron.sh -e SPLPLAY -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLTST3/bin/splenviron.sh -e SPLTST3 -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLTST1/bin/splenviron.sh -e SPLTST1 -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLTST2/bin/splenviron.sh -e SPLTST2 -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLDEVP/bin/splenviron.sh -e SPLDEVP -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLPACK/bin/splenviron.sh -e SPLPACK -c "spl.sh -t stop"'
sleep 2
su - cissys -c '/spl/SPLDEVT/bin/splenviron.sh -e SPLDEVT -c "spl.sh -t stop"'
sleep 2

# STOP THE SSH DAEMON


stopsrc -s sshd
sleep 2

date >> /opt/backupscripts/running.log


who >> /opt/backupscripts/running.log

########################################
# 3. BACKUP COMMANDS #
########################################

case $DAYNAME in
Tue) tapeutil -f /dev/smc0 move 256 4116
tapeutil -f /dev/smc0 move 4101 256
;;
Wed) tapeutil -f /dev/smc0 move 256 4117
tapeutil -f /dev/smc0 move 4100 256
;;
Thu) tapeutil -f /dev/smc0 move 256 4118
tapeutil -f /dev/smc0 move 4099 256
;;
Fri) tapeutil -f /dev/smc0 move 256 4119
tapeutil -f /dev/smc0 move 4098 256
;;
Sat) tapeutil -f /dev/smc0 move 256 4120
tapeutil -f /dev/smc0 move 4097 256
;;
Mon) tapeutil -f /dev/smc0 move 256 4121
tapeutil -f /dev/smc0 move 4096 256
;;
esac

sleep 50

echo "Start of the actual backup" >> ${BACKUPLOG}


mt -f /dev/rmt1 rewind
tar -cf /dev/rmt1.1 /spl
tar -cf /dev/rmt1.1 /apps
tar -cf /dev/rmt1.1 /prj
tar -cf /dev/rmt1.1 /software
tar -cf /dev/rmt1.1 /opt
tar -cf /dev/rmt1.1 /usr
tar -cf /dev/rmt1.1 /data
tar -cf /dev/rmt1.1 /backups
tar -cf /dev/rmt1.1 /u01/oradata
tar -cf /dev/rmt1.1 /u02/oradata
tar -cf /dev/rmt1.1 /u03/oradata
tar -cf /dev/rmt1.1 /u04/oradata
tar -cf /dev/rmt1.1 /u05/oradata
tar -cf /dev/rmt1.1 /u06/oradata
tar -cf /dev/rmt1.1 /u07/oradata
tar -cf /dev/rmt1.1 /u08/oradata
tar -cf /dev/rmt1.1 /home
tar -cf /dev/rmt1.1 /backups3

sleep 10
# TEMPORARY ACTION
date >> /opt/backupscripts/running.log
ps -ef | grep pmon >> /opt/backupscripts/running.log
ps -ef | grep BBL >> /opt/backupscripts/running.log
ps -ef | grep was >> /opt/backupscripts/running.log
who >> /opt/backupscripts/running.log
defragfs /prj

# END OF TEMPORARY ACTION

########################################
# 4. START THE APPLICATIONS            #
########################################

# START THE SSH DAEMON


startsrc -s sshd
sleep 2

# START ALL ORACLE DATABASES


su - oracle -c "/opt/backupscripts/start_oracle.sh"
sleep 30

# START THE ETM INSTANCES:


su - cissys -c '/spl/SPLDEV1/bin/splenviron.sh -e SPLDEV1 -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLDEV2/bin/splenviron.sh -e SPLDEV2 -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLCONF/bin/splenviron.sh -e SPLCONF -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLPLAY/bin/splenviron.sh -e SPLPLAY -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLTST3/bin/splenviron.sh -e SPLTST3 -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLTST1/bin/splenviron.sh -e SPLTST1 -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLTST2/bin/splenviron.sh -e SPLTST2 -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLDEVP/bin/splenviron.sh -e SPLDEVP -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLPACK/bin/splenviron.sh -e SPLPACK -c "spl.sh -t start"'
sleep 2
su - cissys -c '/spl/SPLDEVT/bin/splenviron.sh -e SPLDEVT -c "spl.sh -t start"'
sleep 2

# START WEBSPHERE


cd /prj/was/bin
./startServer.sh server1 -username admin01 -password vga88nt

sleep 30

########################################
# 5. LOG THE END TIME IN A LOG FILE    #
########################################

# Record the tape number and end time in the log:


tapeutil -f /dev/smc0 inventory | head -88 | tail -2 >> ${BACKUPLOG}

echo "End of backup 550:" >> ${BACKUPLOG}


date >> ${BACKUPLOG}

9.2 compress and uncompress:
============================

# compress -v bigfile.exe
Would compress bigfile.exe and rename that file to bigfile.exe.Z.

# uncompress *.Z
would uncompress the files *.Z

9.3 gzip:
=========

To compress a file using gzip, execute the following command:

# gzip filename.tar

This will become filename.tar.gz

To decompress:

# gzip -d filename.tar.gz
# gunzip filename.tar.gz
# gzip -d users.dbf.gz
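
If you want to keep the original file, let gzip write to standard output and
redirect it yourself:

# gzip -c filename.tar > filename.tar.gz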

9.4 bzip2:
==========

#bzip2 filename.tar
This will become filename.tar.bz2
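
To decompress again, either of the following will work:

# bzip2 -d filename.tar.bz2
# bunzip2 filename.tar.bz2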

9.5 dd:
=======

Solaris:
--------

# dd if=<input file> of=<output file> <option=value>

to duplicate a tape:
# dd if=/dev/rmt/0 of=/dev/rmt/1

to clone a disk with the same geometry:


# dd if=/dev/rdsk/c0t1d0s2 of=/dev/rdsk/c0t4d0s2 bs=128

AIX:
----

same command syntax apply to IBM AIX. Here is an AIX pSeries machine with
floppydrive example:

clone a diskette:

# dd if=/dev/fd0 of=/tmp/ddcopy
# dd if=/tmp/ddcopy of=/dev/fd0

Note:

On Linux distros the device associated to the floppy drive is also /dev/fd0

9.6 cpio:
=========

solaris:
--------

cpio <mode><option>
copy-out: cpio -o
copy_in : cpio -i
pass : cpio -p

# cd /var/bigspace
# cpio -idmv Linux9i_Disk1.cpio.gz
# cpio -idmv Linux9i_Disk2.cpio.gz
# cpio -idmv Linux9i_Disk3.cpio.gz

# cpio -idmv < 9204_solaris_release.cpio

# cd /work
# ls -R | cpio -ocB > /dev/rmt/0

# cd /work
# cpio -icvdB < /dev/rmt/0

d will create directories as needed
c will create/read header information in ASCII format, for portability
v verbose
B block input/output at 5120 bytes per record (useful for tape devices)
m retain the previous file modification times

AIX:
----

AIX uses the same syntax. Usually, you should use the following command:

# cpio -idmv < filename.cpio

Copying directories with cpio:
------------------------------

cpio is very good at cloning directories, or making backups, because it copies
files and directories including their ownership and permissions.

Example:
--------

Just cd to the directory that you want to clone and use a command similar to the
following examples.

# find . -print | cpio -pdl /u/disk11/jdoe/fiber

# find . -print | cpio -pdm /a/dev

# find . -print | cpio -pdl /home/jim/newdir

# find . -print | cpio -pdmv /backups2/CONV2-0212

# find . -print | cpio -pdmv /backups2/SPLcobAS40

# find . -print | cpio -pdmv /backups2/SPLcobAS40sp2

# find . -print | cpio -pdmv /backups2/runtime/SPLTST2

The p in the flags stands for "pass-through" (copy) mode.

cd /spl/SPLDEV1
find . -print | cpio -pdmv /spl/SPLDEVT
find . -print | cpio -pdmv /backups2/data

# find . -print | cpio -pdmv /data/documentum/dmadmin/backup_1008/dba_cluster


# find . -print | cpio -pdmv /data/documentum/dmadmin/backup_1008/dmw_et3
# find . -print | cpio -pdmv /data/documentum/dmadmin/backup_1008/dmw_et
# find . -print | cpio -pdmv /data/documentum/dmadmin/backup_1508/dmw_eu
find . -print | cpio -pdmv /data/emcdctm/home2

find . -print | cpio -pdmv /data/documentum/dmadmin/backup_1809/dmw_et


find . -print | cpio -pdmv /data/documentum/dmadmin/backup_1809/dmw_et3

find . -print | cpio -pdmv /data/documentum/dmadmin/appl/l13appl


find . -print | cpio -pdmv /data/documentum/dmadmin/appl/l14appl
find . -print | cpio -pdmv /data/documentum/dmadmin/backup_3110/dmw_et
find . -print | cpio -pdmv /appl/emcdctm/dba_save_311007

Example:
--------

Use cpio copy-pass to copy a directory structure to another location:

# find path -depth -print | cpio -pamVd /new/parent/dir

Example:
--------
Become superuser or assume an equivalent role.
Change to the appropriate directory.

# cd filesystem1

Copy the directory tree from filesystem1 to filesystem2 by using a combination of
the find and cpio commands:

# find . -print -depth | cpio -pdm filesystem2

Example:
--------

Copying directories
Both cpio and tar may be used to copy directories while preserving ownership,
permissions, and directory structure.

cpio example:
cd fromdir
find . | cpio -pdumv todir

tar example:
cd fromdir; tar cf - . | (cd todir; tar xfp -)

tar example over a compressed ssh tunnel:


tar cvf - fromdir | gzip -9c | ssh user@host 'cd todir; gzip -cd | tar xpf -'

Errors:
-------

Errors sometimes found with cpio:

cpio: 0511-903
cpio: 0511-904

1. Try using the -c option: cpio -imdcv < filename.cpio

9.7 the pax command:
====================

Same for AIX and SOLARIS.

The pax utility supports several archive formats, including tar and cpio.

The syntax for the pax command is as follows:

pax <mode> <options>

-r: Read mode. When -r is specified, pax extracts the filenames and directories
found in the archive. The archive is read from disk or tape. If an extracted
file is a directory, the hierarchy is extracted as well. The extracted files
are created relative to the current directory.
None: List mode. When neither -r nor -w is specified, pax displays the filenames
and directories found in the archive file. The list is written to standard output.

-w: Write mode. If you want to create an archive, you use -w.
Pax writes the contents of the files to the standard output in an archive
format specified by the -x option.

-rw: Copy mode. When both -r and -w are specified, pax copies the specified files
to the destination directory.

most important options:

-a = append to the end of an existing archive
-b = block size, multiple of 512 bytes
-c = match all files or archive members except the ones specified
-f = specifies the pathname of the input or output archive
-p <string> = privileges/characteristics to preserve; string is one or more of a, e, m, o, p:
    a  do not preserve file access times
    e  preserve everything: user id, group id, file mode bits, access and modification times
    m  do not preserve file modification times
    o  preserve user id and group id
    p  preserve file mode bits
-x <format> = specifies the archive format (for example cpio or ustar).

Examples:

To copy the current directory contents to tape, use -w mode and -f:
# pax -w -f /dev/rmt0

To list a verbose table of contents stored on tape rmt0, use list mode (no -r or -w) and -f:
# pax -v -f /dev/rmt0
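
To extract an archive from tape, use read mode (-r); to copy a directory tree
while preserving ownership and permissions, use copy mode (-rw). The source and
target directories below are just examples:

# pax -r -v -f /dev/rmt0
# cd /fromdir; pax -rw -pe . /todir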

9.8 pkzip25:
============

PKZIP Usage:

Usage: pkzip25 [command] [options] zipfile [@list] [files...]

Examples:

View .ZIP file contents: pkzip25 zipfile

Create a .ZIP file: pkzip25 -add zipfile file(s)...

Extract files from .ZIP: pkzip25 -extract zipfile

These are only basic examples of PKZIP's capability

About "-extract" switch:

extract
Extract files from a .ZIP file. It's a configurable switch:
-- all     - all files in the .ZIP file
-- freshen - only files in the .ZIP file that already exist in the target directory
             and that are "newer" than those files will be extracted
-- update  - files in the .ZIP file that already exist in the target directory and
             that are "newer" than those files, as well as files that are "not" in
             the target directory, will be extracted

default = all

Example:

# pkzip25 -ext=up save.zip

9.9 SOLARIS: ufsdump and ufsrestore:
====================================

level 0 is a full backup, levels 1-9 are incremental backups

Examples:
---------

# ufsdump 0ucf /dev/rmt/0 /users


# ufsdump 0ucf sparc1:/dev/rmt/0 /export/home

# ufsrestore f /dev/rmt/0 filename


# ufsrestore rf sparc1:/dev/rmt/0 filename

9.10 AIX: mksysb:
=================

The mksysb command creates an installable image of the rootvg. In other words,
mksysb creates a backup of the operating system (that is, the root volume group).
You can use this backup to reinstall a system to its original state after it has
been corrupted.
If you create the backup on tape, the tape is bootable and includes the
installation programs
needed to install from the backup.

To generate a system backup and create an /image.data file (generated by the
mkszfile command) to a tape device named /dev/rmt0, type:
# mksysb -i /dev/rmt0

To generate a system backup and create an /image.data file with map files
(generated by the mkszfile command)
to a tape device named /dev/rmt1, type:
# mksysb -m /dev/rmt1

To generate a system backup with a new /image.data file, but exclude the files in
directory /home/user1/tmp,
create the file "/etc/exclude.rootvg" containing the line /home/user1/tmp/, and
type:
# mksysb -i -e /dev/rmt1

This command will back up the /home/user1/tmp directory itself, but not the files
it contains.

To generate a system backup file named /mksysb_images/node1 and a new /image.data
file for that image, type:
# mksysb -i /mksysb_images/node1

There will be four images on the mksysb tape, and the fourth image will contain
ONLY rootvg JFS or JFS2
mounted file systems. The target tape drive must be local to create a bootable
tape.

The following is a description of mksysb's four images.

+---------------------------------------------------------+
| Bosboot | Mkinsttape | Dummy TOC | rootvg |
| Image | Image | Image | data |
|-----------+--------------+-------------+----------------|
|<----------- Block size 512 ----------->| Blksz defined |
| | by the device |
+---------------------------------------------------------+

Special notes:
--------------

Note 1: mksysb problem
----------------------

Question:
I'm attempting to restore a mksysb tape to a system that only has 18GB of drive
space available for the Rootvg.
Does the mksysb try to restore these mirrored LVs, or does it just make one copy?
If it is trying to rebuild the mirror, is there a way that I can get around that?

Answer:
I had this same problem and received a successful resolution. I place those same
tasks here:
1) Create a new image.data file, run mkszfile file.
2) Change the image.data as follows:
a) cd /
b) vi image.data
c) In each lv_data stanza of this file, change the values of the copies
line by one-half (i.e. copies = 2, change to copies = 1)
Also, change the number of Physical Volumes "hdisk0 hdisk1" to "hdisk0".
d) Save this file.
3) Create another mksysb from the command line that will utilize the newly edited
image.data file, by the command:
mksysb /dev/rmt0 (Do not use smit and do not run with the -i flag;
both would generate a new image.data file.)
4) Use this new mksysb to restore your system on the other box without mirroring.

Note 2: How to restore specific files from a mksysb tape:

---------------------------------------------------------

$ tctl fsf 3
$ restore -xvf /dev/rmt0.1 ./your/file/name

For example, if you need to get the vi command back, put the mksysb tape in the
tape drive
(in this case, /dev/rmt0) and do the following:

cd / # get to the root directory

tctl -f /dev/rmt0 rewind # rewind the tape

tctl -f /dev/rmt0.1 fsf 3 # move the tape to the third file, no rewind

restore -xqf /dev/rmt0.1 -s 1 ./usr/bin/vi # extract the vi binary, no rewind

Further explanation why you must use the fsf 3 (fast forward skip file 3):

The format of the tape is as follows:


1. A BOS boot image
2. A BOS install image
3. A dummy Table Of Contents
4. The system backup of the rootvg

So if you just need to restore some files, first forward the tape pointer to
position 3, counting from 0.

Note 3: How to restore specific files from a mksysb FILE
--------------------------------------------------------

See also note 2

To view:    restore -Tvqf [mksysb file]
To restore: restore -xvqf [mksysb file] [file name]

Note 4: How to restore a directory from a mksysb FILE
------------------------------------------------------

Simply using the restore command.

restore -xvdf <mksysb.image> ./your/directory

The dot at the front of the path is important.


The "-d" flag indicates that this is a directory and everything in it should
be restored. If you omit that, you'll restore an empty directory.
The directory will be restored underneath whatever directory you're in. So
if you're in your home directory it might create:
/home/azhou/your/directory.

With a mksysb image on disk you don't have any positioning to do, like with
a tape.

Note 5: Performing a mksysb migration with CD installation
----------------------------------------------------------

You can perform a mksysb migration with a CD installation of AIX 5.3.

Step 1. Prepare your system for installation:

Prepare for migrating to the AIX 5.3 BOS by completing the following steps:

- Insert the AIX Volume 1 CD into the CD-ROM device.


- Shut down the target system. If your machine is currently running, power it off
by following these steps:
Log in as the root user.
Type shutdown -F.
If your system does not automatically power off, place the power switch in the
Off (0) position.
Attention: You must not turn on the system unit until instructed to do so.

- Turn on all attached external devices. External devices include the following:
Terminals
CD-ROM drives
DVD-ROM drives
Tape drives
Monitors
External disk drives

Turning on the external devices first is necessary so that the system unit can
identify each peripheral device
during the startup (boot) process.

- If your MKSYSB_MIGRATION_DEVICE is a tape, insert the tape for the mksysb in the
tape drive.
If your MKSYSB_MIGRATION_DEVICE is a CD or DVD, and there is an additional CD or
DVD drive on the system
(other than the one being used to boot AIX), insert the mksysb CD or DVD in the
drive to avoid being
prompted to swap medias.

- Insert your customized bosinst.data supplemental diskette in the diskette drive.
If the system does not have a diskette drive, use the network installation method
for mksysb migration.

Step 2. Boot from your installation media:

The following steps migrate your current version of the operating system to AIX
5.3.
If you are using an ASCII console that was not defined in your previous system,
you must define it.
For more information about defining ASCII consoles, see Step 3. Setting up an
ASCII terminal.

Turn the system unit power switch from Off (0) to On (|).

When the system beeps twice, press F5 on the keyboard (or 5 on an ASCII terminal).
If you have a graphics display,
you will see the keyboard icon on the screen when the beeps occur. If you have an
ASCII terminal
(also called a tty terminal), you will see the word "keyboard" when the beeps
occur.
Note: If your system does not boot using the F5 key (or the 5 key on an ASCII
terminal), refer to your
hardware documentation for information about how to boot your system from an AIX
product CD.

The system begins booting from the installation media. The mksysb migration
installation proceeds
as an unattended installation (non-prompted) unless the MKSYSB_MIGRATION_DEVICE is
the same CD or DVD drive
as the one being used to boot and install the system. In this case, the user is
prompted to switch
the product CD for the mksysb CD or DVD(s) to restore the image.data and the
/etc/filesystems file.
After this happens the user is prompted to reinsert the product media and the
installation continues.
When it is time to restore the mksysb image, the same procedure repeats.

The BOS menus do not currently support mksysb migration, so they cannot be loaded.
In a traditional migration,
if there are errors that can be fixed by prompting the user for information
through the menus,
the BOS menus are loaded. If such errors or problems are encountered during mksysb
migration,
the installation asserts and an error stating that the migration cannot continue
displays.
Depending on the error that caused the assertion, information specific to the
error might be displayed.
If the installation asserts, the LED shows "088".

Note 6: create a mksysb tape MANUALLY
-------------------------------------

THIS NOTE DESCRIBES AN UNSUPPORTED METHOD, AND HAS NOT BEEN VERIFIED.

Here we do not mean the "mksysb -i /dev/rmtx" method, but...:

Question:
I have to clone a standalone 6H1 equipped with a 4mm tape, from
another 6H1 which is node of an SP and which does not own a tape !
The consequence is that my source mksysb is a file that is recorded in
/spdata/sys1/install/aixxxx/images
How will I copy this file to a tape to create the correct mksysb tape
that could be used to restore on my target machine ?

Answer:
Use the following method in case the two servers are at the same
AIX level and kernel type (32/64 bits, jfs or jfs2):

- both servers must communicate over an IP network and have a .rhosts
file set up (for using rsh)

cp /var/adm/ras/bosinst.data /bosinst.data
mkszfile

copy these files (bosinst.data and image.data) under "/" on the remote
system

on the server:

tctl -f /dev/rmt0 status


if the block size is not 512:

# chdev -l /dev/rmt0 -a block_size=512


tctl -f /dev/rmt0 rewind
bosboot -a -d /dev/rmt0.1

(create the boot image on the first file of mksysb)

mkinsttape /dev/rmt0.1    (create the second file of the mksysb tape, with
image.data, bosinst.data, and other files like drivers and commands)

echo " Dummy tape TOC" | dd of=/dev/rmt0.1 conv=sync bs=512 > /dev/null
2>&1 (create the third file "dummy toc")

create a named pipe:

mknod /tmp/pipe p

and run the mksysb as this:

dd if=/tmp/pipe | rsh "server_hostname" dd of=/dev/rmt0.1 &


mksysb /tmp/pipe

this last command creates the fourth file with "rootvg" in backup/restore
format

Note 7: Creating a root volume group backup on CD or DVD with the ISO9660 format
--------------------------------------------------------------------------------

Follow this procedure to create a root volume group backup on CD or DVD with the
ISO9660 format.

You can use Web-based System Manager or SMIT to create a root volume group backup
on CD or DVD with the
ISO9660 format, as follows:
Use the Web-based System Manager Backup and Restore application and select System
backup wizard method.
This method lets you create bootable or non-bootable backups on CD-R, DVD-R, or
DVD-RAM media.
OR

To create a backup to CD, use the smit mkcd fast path.


To create a backup to DVD, use the smit mkdvd fast path and select ISO9660 (CD
format).

The following procedure shows you how to use SMIT to create a system backup to CD.

(The SMIT procedure for creating a system backup to an ISO9660 DVD is similar to
the CD procedure.)
Type the smit mkcd fast path. The system asks whether you are using an existing
mksysb image.
Type the name of the CD-R device. (This can be left blank if the Create the CD
now? field is set to no.)
If you are creating a mksysb image, select yes or no for the mksysb creation
options, Create map files?
and Exclude files?. Verify the selections, or change as appropriate.
The mkcd command always calls the mksysb command with the flags to extend /tmp.

You can specify an existing image.data file or supply a user-defined image.data
file. See step 16.

Enter the file system in which to store the mksysb image. This can be a file
system that you created in the rootvg,
in another volume group, or in NFS-mounted file systems with read-write access. If
this field is left blank,
the mkcd command creates the file system, if the file system does not exist, and
removes it when the command completes.

Enter the file systems in which to store the CD or DVD file structure and final CD
or DVD images. These can be
file systems you created in the rootvg, in another volume group, or in NFS-mounted
file systems. If these fields
are left blank, the mkcd command creates these file systems, and removes them when
the command completes,
unless you specify differently in later steps in this procedure.

If you did not enter any information in the file systems' fields, you can select
to have the mkcd command either
create these file systems in the rootvg, or in another volume group. If the
default of rootvg is chosen
and a mksysb image is being created, the mkcd command adds the file systems to the
exclude file and calls
the mksysb command with the -e exclude files option.

In the Do you want the CD or DVD to be bootable? field, select yes to have a boot
image created on the
CD or DVD. If you select no, you must boot from a product CD at the same
version.release.maintenance level,
and then select to install the system backup from the system backup CD.

If you change the Remove final images after creating CD? field to no, the file
system for the CD images
(that you specified earlier in this procedure) remains after the CD has been
recorded.

If you change the Create the CD now? field to no, the file system for the CD
images (that you specified earlier
in this procedure) remains. The settings that you selected in this procedure
remain valid, but the CD is not
created at this time.

If you intend to use an Install bundle file, type the full path name to the bundle
file. The mkcd command copies
the file into the CD file system. You must have the bundle file already specified
in the BUNDLES field,
either in the bosinst.data file of the mksysb image or in a user-specified
bosinst.data file. When this
option is used to have the bundle file placed on the CD, the location in the
BUNDLES field of the bosinst.data
file must be as follows:
/../usr/sys/inst.data/user_bundles/bundle_file_name

To place additional packages on the CD or DVD, enter the name of the file that
contains the packages list
in the File with list of packages to copy to CD field. The format of this file is
one package name per line.
If you are planning to install one or more bundles after the mksysb image is
restored, follow the directions
in the previous step to specify the bundle file. You can then use this option to
have packages listed
in the bundle available on the CD. If this option is used, you must also specify
the location of installation
images in the next step.

Enter the location of installation images that are to be copied to the CD file
system (if any) in the Location
of packages to copy to CD field. This field is required if additional packages are
to be placed on the CD
(see the previous step). The location can be a directory or CD device.

You can specify the full path name to a customization script in the Customization
script field. If given,
the mkcd command copies the script to the CD file system. You must have the
CUSTOMIZATION_FILE field already set
in the bosinst.data file in the mksysb image or else use a user-specified
bosinst.data file with the CUSTOMIZATION_FILE field set. The mkcd command copies
this file to the RAM file system. Therefore, the path in the CUSTOMIZATION_FILE
field must be as follows:
/../filename

You can use your own bosinst.data file, rather than the one in the mksysb image,
by typing the full path name
of your bosinst.data file in the User supplied bosinst.data file field.
To turn on debugging for the mkcd command, set Debug output? to yes. The debug
output goes to the smit.log.
You can use your own image.data file, rather than the image.data file in the
mksysb image, by typing the
full path name of your image.data file for the User supplied image.data file
field.
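
The same backup can also be made from the command line with the mkcd command
itself. A minimal sketch (check the mkcd man page on your oslevel; the device
name /dev/cd1, the image directory /mkcd/cd_images and the mksysb image path are
assumptions):

# mkcd -d /dev/cd1                          # create a bootable rootvg backup CD in one step
# mkcd -S -I /mkcd/cd_images                # only build the final CD images, do not burn
# mkcd -d /dev/cd1 -m /backup/mksysb.image  # burn a CD from an existing mksysb image
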
Note 8: 0301-150 bosboot: Invalid or no boot device specified!
--------------------------------------------------------------

== Technote:

APAR status
Closed as program error.

Error description

On a system that does not have tape support
installed, running mkszfile will show the
following error:
0301-150 bosboot: Invalid or no boot device
specified.

Local fix
Install device support for scsi tape devices.

Problem summary
Error message when creating backup if devices.scsi.tape.rte
not installed even if the system does not have a tape drive.

Problem conclusion
Redirect message to /dev/null.

Temporary fix
Ignore message.

Comments
APAR information
APAR number IY52551 IY95261
Reported component name AIX 5L POWER V5
Reported component ID 5765E6200
Reported release 520
Status CLOSED PER
PE NoPE
HIPER NoHIPER
Submitted date 2004-01-12
Closed date 2004-01-12
Last modified date 2004-02-27

== Technote:

APAR status
Closed as program error.

Error description
If /dev/ipldevice is missing, mkszfile will show the
bosboot usage statement.

0301-150 bosboot: Invalid or no boot device


specified!
Local fix
Problem summary
If /dev/ipldevice is missing, mkszfile will show the
bosboot usage statement.

0301-150 bosboot: Invalid or no boot device


specified!

Problem conclusion
Do not run bosboot against /dev/ipldevice.

Temporary fix
Comments

APAR information
APAR number IY95261
Reported component name AIX 5.3
Reported component ID 5765G0300
Reported release 530
Status CLOSED PER
PE NoPE
HIPER NoHIPER
Submitted date 2007-02-22
Closed date 2007-02-22
Last modified date 2007-06-06

APAR is sysrouted FROM one or more of the following:

APAR is sysrouted TO one or more of the following:

Publications Referenced

Fix information
Fixed component name AIX 5.3
Fixed component ID 5765G0300

== thread:

Q:

>
> Someone out there knows the fix for this one; if you get a moment, would you
> mind giving me the fix?
>
>
> # mksysb -i /dev/rmt0
>
> /dev/ipldevice not found
>

A:

The ipldevice file is probably deleted from your /dev directory, or points
to the wrong entry. The /dev/ipldevice file is (re)created during the second
phase of boot. For additional information, look into the /sbin/rc.boot script.
The ipldevice entry is a hard link, usually pointing to /dev/rhdiskN, assuming
that the boot device is hdiskN.
Check your system and you should see something similar:
find /dev -links 2 -ls
....
8305 0 crw------- 2 root system 14, 1 Feb 20 2005 /dev/rhdisk0
8305 0 crw------- 2 root system 14, 1 Feb 20 2005 /dev/ipldevice
...
(The first column of the output is the inode number)

So, you can recreate the wrong or missing ipldevice file.
'bootinfo -b' reports the physical boot device name.
For example:
ln -f /dev/rhdisk0 /dev/ipldevice

I hope this will solve your bosboot problem.
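
To script this fix, something along these lines should work (a small sketch; it
assumes 'bootinfo -b' returns the boot disk name, e.g. hdisk0):

# BOOTDISK=`bootinfo -b`                    # e.g. hdisk0
# ln -f /dev/r${BOOTDISK} /dev/ipldevice    # recreate the hard link to the raw device
# ls -il /dev/r${BOOTDISK} /dev/ipldevice   # both entries should show the same inode number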

Q:

I was installing Atape driver and noticed bosboot failure when installp
calls bosboot with /dev/ipldevice. Messages below:

0503-409 installp: bosboot verification starting...


0503-497 installp: An error occurred during bosboot verification
processing.

Inspection of /dev showed no ipldevice file

I was able to easily recreate the /dev/ipldevice using

ln /dev/rhdisk0 /dev/ipldevice

then successfully install the Atape driver software.

After reboot /dev/ipldevice is missing again???.

Environment is p5 520 AIX 5.3 ML1


mirrored internal drives hdisk0 and hdisk1 in rootvg

I have 5.3 ML2 (but have not applied yet)


I don't see any APAR's in ML2 regarding /dev/ipldevice problems.

A:

Are you using EMC disk? There is a known problem with the later
Powerpath versions where the powerpath startup script removes the
/dev/ipldevice file if there is more than one device listed in the
bootlist.

A:

Yes, running EMC PowerPath 4.3 for AIX, with EMC Clariion CX600 Fibre
disks attached to SAN. I always boot from, and mirror the OS on IBM
internal disks. We order 4 internal IBM drives. Two for primary OS and
mirror, the other two for alt_disk and mirrors.

Thanks for the tip. I will investigate at EMC Powerlink site for fix. I
know PowerPath 4.4 for AIX is out, but still pretty new.
A:

ipldevice is a link to the raw device (rhdisk0, not hdisk0)

-----Original Message-----
From: IBM AIX Discussion List [mailto:[email protected]] On Behalf Of
Robert Miller
Sent: Wednesday, April 07, 2004 6:13 PM
To: [email protected]
Subject: Re: 64 Bit Kernel

It may be one of those odd IBMisms where they want to call something a
certain name so they put it in as a link to the actual critter...

Looking on my box, the /dev/ipldevice has the same device major and
minor numbers as hdisk0 - tho it is interesting that ipldevice is a
character device, where a drive is usually a block device:

mybox:rmiller$ ls -l /dev/ipl*
crw------- 2 root system 23, 0 Jan 15 2002 /dev/ipldevice
mybox:rmiller$ ls -l /dev/hdisk0
brw------- 1 root system 23, 0 Sep 13 2002 /dev/hdisk0

A:

> Hi,

> AIX 5.3


> I have a machine where /dev/ipldevice doesn't exist
> Can I reboot it safely?
> How can I re-create it?

> Thanks in advance

I did this today, and there is probably a more accepted way.
I made a hard link from my rhdiskX device to /dev/ipldevice.

If your boot device is /dev/hdisk0, then the command line would be as follows:

ln /dev/rhdisk0 /dev/ipldevice

Again, there is probably a more acceptable way to achieve this, but it
worked for me.

== thread:

How to recover from an "invalid or no boot device" error in AIX

Description

When running the command "bosboot -ad /dev/ipldevice" in IBM AIX, you get the
following error:
0301-150 bosboot: Invalid or no boot device specified!

A device specified with the bosboot -d command is not valid. The bosboot command
was unable to finish processing
because it could not locate the required boot device. The installp command calls
the bosboot command
with /dev/ipldevice. If this error does occur, it is probably because
/dev/ipldevice does not exist.
/dev/ipldevice is a link to the boot disk.

To determine if the link to the boot device is missing or incorrect :

1) Verify the link exists:

# ls -l /dev/ipldevice
ls: 0653-341 The file /dev/ipldevice does not exist.

2) In this case, it does not exist. To identify the boot disk, enter "lslv -m
hd5". The boot disk name displays.

# lslv -m hd5
hd5:N/A
LP PP1 PV1 PP2 PV2 PP3 PV3
0001 0001 hdisk4 0001 hdisk1

In this example the boot disk name is hdisk4 and hdisk1.

3) Create a link between the boot device indicated and the /dev/ipldevice file.
Enter:

# ln /dev/boot_device_name /dev/ipldevice
(An example of boot_device_name is rhdisk0.)

In my case, I ran:

# ln /dev/rhdisk4 /dev/ipldevice

4) Now run the bosboot command again:

# bosboot -ad /dev/ipldevice


Example

lslv -m hd5; ln /dev/rhdisk4 /dev/ipldevice; bosboot -ad /dev/ipldevice

Note 9: Other mksysb errors on AIX 5.3:
---------------------------------------

It turns out that on AIX 5.3, at certain ML/TL levels (below TL 6), an mksysb
error turns up if you have volume groups defined other than rootvg while there
is NO filesystem created on those volume groups.

Solution: create a filesystem, even just a "test" or "dummy" filesystem, on those
VG's.
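
For example, a throwaway filesystem can be created on the affected VG before
running savevg/savevgstruct (a sketch; the VG name vg_dev and the mount point
/dummyfs are only examples):

# crfs -v jfs2 -g vg_dev -m /dummyfs -a size=16M   # small dummy jfs2 filesystem in the VG
# mount /dummyfs
# savevg -if /dev/rmt0 vg_dev                      # the VG backup should now run without the error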
>> thread 1:

Q:

Hi

can't find any information about "backup structure of volume group, vios".
included service:
"savevgstruct vgname" working with errors:
# lsvg
rootvg
vg_dev
datavg_dbs
# /usr/ios/cli/ioscli savevgstruct vg_dev

Creating information file for volume group vg_dev..

Some error messages may contain invalid information
for the Virtual I/O Server environment.

cat: 0652-050 Cannot open /tmp/vgdata/vg_dev/fs_data_tmp.

# ls -al /tmp/vgdata/vg_dev/
total 16
drwxr-xr-x 2 root staff 256 Apr 02 08:38 .
drwxrwxr-x 5 root system 256 Apr 02 08:20 ..
-rw-r--r-- 1 root staff 2002 Apr 02 08:35 filesystems
-rw-r--r-- 1 root staff 1537 Apr 02 08:35 vg_dev.data
# oslevel -r
5300-05
# df -k | grep tmp
/dev/hd3 1310720 1309000 1% 42 1% /tmp

A:

I had this issue as well with VIO 1.3. I called IBM support
about it and it is a known issue. The APAR is IY87935. The fix
will not be released until AIX 5.3 TL 6, which is due out in
June. It occurs when you run savevgstruct on a user defined
volume group that contains volumes where at least one does not
have a filesystem defined on it. The workaround is to define a
filesystem on every volume in the user defined volume group.

>> thread 2:

IBM APAR Note:

http://www-1.ibm.com/support/docview.wss?uid=isg1IY87935

IY87935: MKVGDATA/SAVEVG CAN FAIL

APAR status
Closed as program error.
Error description
The mkvgdata command when executed on a volume group that does
not have any mounted filesystems:

# savevg -f /home/vgbackup -i vg00

Creating information file for volume group vg00..cat:
0652-050 Cannot open /tmp/vgdata/vg00/fs_data_tmp.

/usr/bin/savevg 33 : BACKUPSHRINKSIZE = 16 + FSSHRINKSIZE :
0403-009 The specified number is not valid for this command.

Local fix

Problem summary
The mkvgdata command when executed on a volume group that does
not have any mounted filesystems:

# savevg -f /home/vgbackup -i vg00

Creating information file for volume group vg00..cat:
0652-050 Cannot open /tmp/vgdata/vg00/fs_data_tmp.

/usr/bin/savevg 33 : BACKUPSHRINKSIZE = 16 + FSSHRINKSIZE :
0403-009 The specified number is not valid for this command.

Problem conclusion
Check variable.

Temporary fix

Comments

APAR information
APAR number IY87935
Reported component name AIX 5.3
Reported component ID 5765G0300
Reported release 530
Status CLOSED PER
PE NoPE
HIPER NoHIPER
Submitted date 2006-08-09
Closed date 2006-08-09
Last modified date 2006-08-09

9.11 AIX: the backup and restore commands:
------------------------------------------

The backup command creates copies of your files on a backup medium, such as a
magnetic tape or diskette.
The copies are in one of the two backup formats:

- Specific files and directories, backed up by name using the -i flag.
- Entire file system backed up by i-node, not using the -i flag,
  but instead using the Level and FileSystem parameters.

Unless you specify another backup medium with the -f parameter, the backup command
automatically writes its output to /dev/rfd0, which is the diskette drive.

(1) Backing up the user directory "userdirectory":

# cd /userdirectory
# find . -depth | backup -i -f /dev/rmt0      # or use find . -print

(2) Incremental backups:

You can create full and incremental backups of filesystems as well, as shown in
the following example. When the -u flag is used with the backup command, the
system will do an incremental backup according to the -level number specified.
For example, a level 5 backup will only back up the data that has changed after
the level 4 backup was made. Levels can range from 0 to 9.

Example;

On Sunday:
# backup -0 -uf /dev/rmt0 /data
On Monday:
# backup -1 -uf /dev/rmt0 /data
..
..
On Saturday:
# backup -6 -uf /dev/rmt0 /data

Due to the -u parameter, information about the backups is written to the
/etc/dumpdates file.

To backup the / (root) file system, enter:

# backup -0 -u -f /dev/rmt0 /

Note that we do not use the -i flag, but instead back up an entire filesystem "/".

Other examples:
---------------

To backup all the files and subdirectories in the current directory using relative
pathnames, use
# find . -print | backup -if /dev/rmt0

To backup the files /bosinst.data and /signature to the diskette, use

# ls ./bosinst.data ./signature | backup -iqv

How to restore a file:
----------------------

Suppose we want to restore the /etc/hosts file, because it's missing.

# tctl -f /dev/rmt0 rewind                           # rewind the tape
# restore -x -d -v -q -s4 -f /dev/rmt0.1 ./etc/hosts

Another example:

# restore -qvxf /dev/rmt0.1 "./etc/passwd"           # restore the /etc/passwd file
# restore -s4 -qTvf /dev/rmt0.1                      # list the contents of a mksysb tape
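
To restore a complete file system archive (one made by i-node, without -i) rather
than individual files, the -r flag can be used. A sketch, assuming the level 0 tape
and any incremental tapes are restored in sequence into the (empty) mounted
filesystem:

# cd /data
# restore -rqvf /dev/rmt0        # repeat for the level 1, 2, ... tapes in order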

9.12 AIX: savevg and restvg:
----------------------------

To backup, or clone, a VG, you can use the

- mksysb command for the rootvg
- savevg command for other user VG's

To backup a user Volume Group (VG, see also sections 30 and 31) you can use savevg
to backup a VG and restvg to restore a VG.

# lsvg                            # shows a list of online VG's
rootvg
uservg

# savevg -if /dev/rmt0 uservg     # now backup the uservg
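
To restore that user VG later, use restvg (a sketch; hdisk3 is only an example of
the target disk):

# restvg -f /dev/rmt0 hdisk3     # recreate uservg and its filesystems on hdisk3
# restvg -s -f /dev/rmt0 hdisk3  # the same, but shrink the filesystems to minimum size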

9.13 AIX: tctl:
---------------

Purpose
Gives subcommands to a streaming tape device.

Syntax
tctl [ -f Device ] [ eof | weof | fsf | bsf | fsr | bsr | rewind | offline |
rewoffl | erase | retension | reset | status ] [ Count ]

tctl [ -b BlockSize ] [ -f Device ] [ -p BufferSize ] [ -v ] [ -n ] [ -B ] { read | write }

Description
The tctl command gives subcommands to a streaming tape device. If you do not
specify the Device variable
with the -f flag, the TAPE environment variable is used. If the environment
variable does not exist,
the tctl command uses the /dev/rmt0.1 device. (When the tctl command gives the
status subcommand,
the default device is /dev/rmt0.) The Device variable must specify a raw (not
block) tape device.
The Count parameter specifies the number of end-of-file markers, number of file
marks, or number of records.
If the Count parameter is not specified, the default count is 1.

Examples
To rewind the rmt1 tape device, enter:
tctl -f /dev/rmt1 rewind

To move forward two file marks on the default tape device, enter:
tctl fsf 2
To write two end-of-file markers on the tape in /dev/rmt0.6, enter:
tctl -f /dev/rmt0.6 weof 2

To read a tape device formatted in 80-byte blocks and put the result in a file,
enter:
tctl -b 80 read > file

To read variable-length records from a tape device formatted in 80-byte blocks and
put the result in a file, enter:
tctl -b 80 -n read > file

To write variable-length records to a tape device using a buffer size of 1024
bytes, enter:
cat file | tctl -b 1024 -n -f /dev/rmt1 write

To write to a tape device in 512-byte blocks and use a 5120-byte buffer for
standard input, enter:
cat file | tctl -v -f /dev/rmt1 -p 5120 -b 512 write

Note: The only valid block sizes for quarter-inch (QIC) tape drives are 0 and 512.
To write over one of several backups on an 8 mm tape, position the tape at the
start of the backup file
and issue these commands:
tctl bsf 1

tctl eof 1
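
As the description above mentions, if no -f flag is given tctl falls back to the
TAPE environment variable. A quick sketch (ksh syntax):

# export TAPE=/dev/rmt1.1
# tctl rewind                    # operates on /dev/rmt1.1, no -f needed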

9.14 AIX mt command:
--------------------

Purpose
Gives subcommands to streaming tape device.

Syntax
mt [ -f TapeName ] Subcommand [ Count ]

Description
The mt command gives subcommands to a streaming tape device. If you do not specify
the -f flag
with the TapeName parameter, the TAPE environment variable is used. If the
environment variable
does not exist, the mt command uses the /dev/rmt0.1 device. The TapeName parameter
must be a raw (not block)
tape device. You can specify more than one operation with the Count parameter.

Subcommands

eof, weof Writes the number of end-of-file markers specified by the Count
parameter at the
current position on the tape.
fsf Moves the tape forward the number of files specified by the Count
parameter and positions
it to the beginning of the next file.
bsf Moves the tape backwards the number of files specified by the Count
parameter and positions
it to the beginning of the last file skipped. If using the bsf
subcommand would cause the tape head
to move back past the beginning of the tape, then the tape will be
rewound, and the mt command will return EIO.
fsr Moves the tape forward the number of records specified by the Count
parameter.
bsr Moves the tape backwards the number of records specified by the Count
parameter.
rewoffl, rewind Rewinds the tape. The Count parameter is ignored.
status Prints status information about the specified tape device. The output of
the status command
may change in future implementations

Examples
To rewind the rmt1 tape device, enter:

mt -f /dev/rmt1 rewind
To move forward two files on the default tape device, enter:

mt fsf 2
To write two end-of-file markers on the tape in the /dev/rmt0.6 file, enter:

mt -f /dev/rmt0.6 weof 2

9.14 AIX tapeutil command:
--------------------------

tapeutil -f <devicename> <commands>

A program which comes with the tape library to control its operation. Called
without arguments it presents a menu. It is useful for things like moving
tapes from a slot to the drive, e.g.

$ tapeutil -f /dev/smc0 move -s 10 -d 23

which moves the tape in slot 10 to the drive (the exact numbers depend on your
own individual tape library; consult its manual).

The fileset you need to install for the 'tapeutil' command is:
Atape.driver 7.1.5.0.

Example:
--------

We are using a 3583 automated tape library for backups. For the tapeutil command you
need to have the file atape.sys on your system. To identify the positions of the tape
drives and the source slot, just type tapeutil; it will give you a number of options.
Choose "element information" to identify the source and tape drive numbers.
In our case the tape drive numbers are 256 and 257, and the source slot to
insert the tape is 16.
We usually give the following commands to load and move the tape.

Loading the tape:
tapeutil -f /dev/smc0 move -s 16 -d 256
(inserts the tape into tape drive 1, where 16 is the source and 256 is the destination)

To take the backup:

find filesystem1 filesystem2 | backup -iqvf /dev/rmt1

(filesystem names without the leading slash of the mount point)

After taking the backup, unload and return the tape:

tapeutil -f /dev/rmt1 unload

tapeutil -f /dev/smc0 move -s 256 -d 16

(first unload the tape, then move it back to the source slot)

This might help you to use the tapeutil command when taking backups.

Example:
--------

In order to move tapes in and out of the Library here is what I do.

First I unload the tape with the command: tapeutil -f /dev/rmtx unload
(where x is 0,1,2,3...).
Then I move the tape to the external slot (16) using the media changer, not the tape
drive:

#tapeutil -f /dev/smcx move 256 16

The above command moves the tape in your first tape drive (256) to the external
slot.
Note that you can also move from the internal slots to the external slot or the
tape drive.
To move the tape back from the external slot, I just switch 256 and 16 parameters.

Example:
--------

The code I use to list the I/O station slots is:

/usr/bin/tapeutil -f /dev/smc0 inventory | grep -p Station | egrep 'Station|Volume' | awk '{
if($1 == "Import/Export") ioslot=$4;
if($1 == "Volume") {
if(NF == 4) volser=$4;
else volser="-open-";
print ioslot, volser;
}}'

The tapeutil command to move a tape is:

/usr/bin/tapeutil -f /dev/smc0 move <fromslot> <toslot>

For example: /usr/bin/tapeutil -f /dev/smc0 move 773 1037

You can get the slot numbers, and volsers in them, with the command:
/usr/bin/tapeutil -f /dev/smc0 inventory
To find an open slot just look for a slot with a blank "Volume Tag".

One little hitch, however. If a tape is currently mounted, the "tapeutil
inventory" command will show a slot as open ("Volume Tag" is blank), but TSM
will have it reserved for the mounted tape. So what I did in my script is to
check the TSM device configuration file for each open slot that I find, and if
that slot number appears in it then I skip that slot and go on to the next one.

Example:
--------

#!/bin/ksh
DEVICE=$1
HOST=$2
TAPE=$3
case $TAPE in
2) tapeutil -f /dev/smc0 move 23 10
tapeutil -f /dev/smc0 move 11 23
;;
3) tapeutil -f /dev/smc0 move 23 11
tapeutil -f /dev/smc0 move 12 23
;;
4) tapeutil -f /dev/smc0 move 23 12
tapeutil -f /dev/smc0 move 13 23
;;
5) tapeutil -f /dev/smc0 move 23 13
tapeutil -f /dev/smc0 move 14 23
;;
esac

Example:
--------

tapeutil -f /dev/rmt1 unload

tapeutil -f /dev/smc0 move 257 16
tapeutil -f /dev/smc0 move -s 256 -d 16
tapeutil -f /dev/smc0 move 257 1025
tapeutil -f /dev/smc0 move 16 257

tapeutil -f /dev/smc0 exchange 34 16 40

tapeutil -f /dev/smc0 inventory | more
tctl -f /dev/rmt0 rewoffl
tapeutil -f /dev/smc0 elementinfo
tapeutil -f /dev/smc0 inventory

Example:
--------

tapeutil -f /dev/rmt1 unload


sleep 20
DAYNO=`date +%d`;export DAYNO

case $DAYNO in
01) tapeutil -f /dev/smc0 move 23 10
tapeutil -f /dev/smc0 move 11 23
;;
02) tapeutil -f /dev/smc0 move 23 10
tapeutil -f /dev/smc0 move 11 23
;;
03) tapeutil -f /dev/smc0 move 23 10
tapeutil -f /dev/smc0 move 11 23
;;
04) tapeutil -f /dev/smc0 move 23 10
tapeutil -f /dev/smc0 move 11 23
;;
05) tapeutil -f /dev/smc0 move 23 10
tapeutil -f /dev/smc0 move 11 23
;;
06) tapeutil -f /dev/smc0 move 23 10
tapeutil -f /dev/smc0 move 11 23
;;
07) tapeutil -f /dev/smc0 move 23 10
tapeutil -f /dev/smc0 move 11 23
;;
esac

Example:
--------

tapeutil -f /dev/rmt1 unload


sleep 20

DAYNAME=`date +%a`;export DAYNAME

case $DAYNAME in
Sun) tapeutil -f /dev/smc0 move 256 4098
tapeutil -f /dev/smc0 move 4099 256
;;
Mon) tapeutil -f /dev/smc0 move 256 4099
tapeutil -f /dev/smc0 move 4100 256
;;
Tue) tapeutil -f /dev/smc0 move 256 4100
tapeutil -f /dev/smc0 move 4113 256
;;
Wed) tapeutil -f /dev/smc0 move 256 4113
tapeutil -f /dev/smc0 move 4114 256
;;
Thu) tapeutil -f /dev/smc0 move 256 4114
tapeutil -f /dev/smc0 move 4109 256
;;
Fri) tapeutil -f /dev/smc0 move 256 4109
tapeutil -f /dev/smc0 move 4124 256
;;
Sat) tapeutil -f /dev/smc0 move 256 4124
tapeutil -f /dev/smc0 move 4110 256
;;
esac
tapeutil -f /dev/smc0 move 256 4098
tapeutil -f /dev/smc0 move 4099 256

Example:
--------

tapeutil -f /dev/smc0 move 16 4096


sleep 10
tapeutil -f /dev/smc0 move 17 4097
sleep 10
tapeutil -f /dev/smc0 move 18 4098
sleep 10
tapeutil -f /dev/smc0 move 19 4099
sleep 10
tapeutil -f /dev/smc0 move 20 4100
sleep 10
tapeutil -f /dev/smc0 move 21 4101
sleep 10

Example:
--------

mt -f /dev/rmt1 rewind
mt -f /dev/rmt1.1 fsf 6
tar -xvf /dev/rmt1.1 /data/download/expdemo.zip
SPL bld

About Ts3310:
-------------

Abstract
Configuration Information for IBM TS3310 (IBM TotalStorage 3576)

Content

IBM TS3310 (IBM TotalStorage 3576)

Drive Addresses          : 256-261
Storage Slot Addresses   : 4096-4223
Changer Address          : 1
Entry/Exit Slot Addresses: 16-21

Notes:

1. Barcodes are required. Without a barcode label, a volume will show as unknown
media.

2. ELEMent=AUTODetect in the DEFINE/UPDATE DRIVE command is supported.

3. Device identification and firmware used during validation


Library ID: IBM 3576-MTL --- Firmware: 0.62

4. The IBM device driver is required. The IBM device drivers are available at
ftp://ftp.software.ibm.com/storage/devdrvr.

5. The library is available with IBM LTO Generation 3 drives.


6. For more information on IBM TS3310, see TS3310 Tape Library.

Example:
--------

First, list the tape device names:

lsdev -Cc tape

Assume it returns smc0 for the library, and rmt0 and rmt1 for the tape drives, and
all devices are Available.

Next, take an inventory of the library:

tapeutil -f /dev/smc0 inventory | more

Assume the inventory returns two drives with element numbers 256 and 257 and shows
a tape stored in slot 1025.

Then, start moving the tape to each drive in turn, and verify which device name it
is associated with
by running tctl or mt rewoffl. If it returns without error, the device name
matches the element number.

Move the tape from the tape slot to the first drive:
tapeutil -f /dev/smc0 move 1025 256
tctl -f/dev/rmt0 rewoffl
If the command returns with no errors, then element # 256 matches device name
/dev/rmt0.

Move the tape to the next drive:

tapeutil -f /dev/smc0 move 256 257
tctl -f /dev/rmt1 rewoffl

If the command returns with no errors, then element # 257 matches device name
/dev/rmt1.

Move the tape back to the storage slot it came from:

tapeutil -f /dev/smc0 move 257 1025

If at any point, the tctl command returns with errors, then try another device
name until it returns without errors.

NOTE: the 'rewoffl' flag on tctl simply rewinds and ejects the tape from the
drive.

9.15 Recover from AIX OS failure:
---------------------------------

Recover from OS failure.

Contents:
1. How to view the bootlist:
2. How to change the bootlist:
3. How to make a device bootable:
4. How to make a backup of the OS:
5. Shutdown a pSeries AIX system in the most secure way:
6. How to restore specific files from a mksysb tape:

7. Recovery of rootvg

1. How to view the bootlist:

At boottime, once the POST is completed, the system will search the boot list for
a
bootable image. The system will attempt to boot from the first entry in the
bootlist.
It's always a good idea to see which devices the OS considers bootable, and in
which order it will try them. Use the bootlist command to view the order:

# bootlist -m normal -o

As the first item returned, you will see hdisk0, the bootable harddisk.

If you need to check the bootlist in "service mode", for example if you want to
boot from tape to restore the rootvg, use

# bootlist -m service -o

2. How to change the bootlist:

The bootlist, in normal operations, can be changed using the same command as used
in section 1, for example

# bootlist -m normal hdisk0 cd0

This command makes sure the hdisk0 is the first device used to boot the system.

If you want to change the bootlist for the system in service mode, you can change
the list in order to use rmt0
if you need to restore the rootvg.

# bootlist -m service rmt0

3. How to make a device bootable:

To make a device bootable, use the bosboot command:

# bosboot -ad /dev/ipldevice

So, if hdisk0 must be bootable, or you want to be sure its bootable, use

# bosboot -ad /dev/hdisk0

4. How to make a backup of the OS:

The mksysb command creates an installable image of the rootvg. In other words,
mksysb creates a backup of the operating system (that is, the root volume group).

You can use this backup to reinstall a system to its original state after it has
been corrupted.
If you create the backup on tape, the tape is bootable and includes the
installation programs
needed to install from the backup.

To generate a system backup and create an /image.data file (generated by the
mkszfile command) to a tape device named /dev/rmt0, type:

# mksysb -i /dev/rmt0

If a backup tape was created with the -e switch, like in:

# mksysb -i -e /dev/rmt0

then a number of directories are NOT included in the backup. These exclusions are
listed in the "/etc/exclude.rootvg" file.

The mksysb command should be used regularly. It should certainly be run after
installing applications or devices. Under normal conditions the OS does not change
much, but a bootable backup tape should still be created at a regular interval.
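
If you want to automate this, a root crontab entry along these lines can run a
weekly mksysb (a sketch; the schedule, tape device and logfile are assumptions):

0 2 * * 0 /usr/bin/mksysb -i /dev/rmt0 > /tmp/mksysb_weekly.log 2>&1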

5. Shutdown a pSeries AIX system in the most secure way:

1. Shut down all applications in a controlled way.
2. Make sure no users are on the system.
3. Use the shutdown command:

shutdown -r     to reboot the system
shutdown -m     to bring the system down to maintenance mode

6. How to restore specific files from a mksysb tape:

$ tctl fsf 3
$ restore -xvf /dev/rmt0.1 ./your/file/name

For example, if you need to get the vi command back, put the mksysb tape in the
tape drive (in this case, /dev/rmt0)
and do the following:

cd /                                           # get to the root directory
tctl -f /dev/rmt0 rewind                       # rewind the tape
tctl -f /dev/rmt0.1 fsf 3                      # move the tape to the third file, no rewind
restore -xqf /dev/rmt0.1 -s 1 ./usr/bin/vi     # extract the vi binary, no rewind

Further explanation of why you must use fsf 3 (forward space 3 files):

The format of the tape is as follows:

1. A BOS boot image
2. A BOS install image
3. A dummy Table Of Contents
4. The system backup of the rootvg

So if you just need to restore some files, first forward the tape pointer to
position 3, counting from 0.

7. Recovery of rootvg

7.1 Check if the system can boot from tape:

# bootinfo -e

If a 1 is returned, the system can boot from tape, if a 0 is returned a boot from
tape is not supported.

7.2 Recover the rootvg:

One possible method is the following:

1. Check whether the tape is in front of the disk with the bootlist command:
# bootlist -m normal -o
2. Insert the mksysb tape
3. Power on the machine. The system will boot from the tape.
4. The Installation and Maintenance Menu will be displayed.

Welcome to Base Operating System


Installation and Maintenance

Type the number of your choice and press Enter. Choice is indicated by >>>.

>>> 1 Start Install Now with Default Settings

2 Change/Show Installation Settings and Install

3 Start Maintenance Mode for System Recovery

Type 3 and press enter to start maintenance mode.


The next screen you should see is :-

Maintenance

Type the number of your choice and press Enter.

>>> 1 Access a Root Volume Group


2 Copy a System Dump to Removable Media
3 Access Advanced Maintenance Functions
4 Install from a System Backup

>>> Choice [1]:

Type 4 and press enter to install from a system backup.


The next screen you should see is :-

Choose Tape Drive

Type the number of the tape drive containing the system backup to be
installed and press Enter.

Tape Drive Path Name

>>> 1 tape/scsi/ost /dev/rmt0

>>> Choice [1]:

Type the number that corresponds to the tape drive that the mksysb tape
is in and press enter.
The next screen you should see is :-

Welcome to Base Operating System


Installation and Maintenance

Type the number of your choice and press Enter. Choice is indicated by >>>.

>>> 1 Start Install Now with Default Settings

2 Change/Show Installation Settings and Install

3 Start Maintenance Mode for System Recovery

+-----------------------------------------------------
88 Help ? |Select 1 or 2 to install from tape device /dev/rmt0
99 Previous Menu |
|
>>> Choice [1]:

You can now follow your normal mksysb restore procedures.

9.16 HP-UX make_net_recovery:
-----------------------------

There are two ways you can recover from a tape with make_net_recovery. The method
you choose depends on your needs.

- Use make_medialif
This method is useful when you want to create a totally self-contained recovery
tape. The tape will be bootable
and will contain everything needed to recover your system, including the archive
of your system. During recovery,
no access to an Ignite-UX server is needed. Using make_medialif is described
beginning on
"Create a Bootable Archive Tape via the Network" and also on the Ignite-UX server
in the file:
/opt/ignite/share/doc/makenetrec.txt

- Use make_boot_tape
This method is useful when you do not have the ability to boot the target machine
via the network, but are still
able to access the Ignite-UX server via the network for your archive and
configuration data. This could happen
if your machine does not support network boot or if the target machine is not on
the same subnet as the
Ignite-UX server. In these cases, use make_boot_tape to create a bootable tape
with just enough information
to boot and connect with the Ignite-UX server. The configuration files and archive
are then retrieved from the
Ignite-UX server. See the make_boot_tape(1M) manpage for details.

-- make_boot_tape:

make_boot_tape(1M) make_boot_tape(1M)

NAME
make_boot_tape - make a bootable tape to connect to an Ignite-UX
server

SYNOPSIS
/opt/ignite/bin/make_boot_tape [-d device-file-for-tape] [-f config-file]
[-t tmpdir] [-v]

/opt/ignite/bin/make_boot_tape [-d device-file-for-tape] [-g gateway]
[-m netmask] [-t tmpdir] [-v]

DESCRIPTION
The tape created by make_boot_tape is a bootable tape that contains
just enough information to boot the system and then connect to the
Ignite-UX server where the tape was created. Once the target system
has connected with the Ignite-UX server, it can be installed or
recovered using Ignite-UX. The tape is not a fully self-contained
install tape; an Ignite-UX server must also be present. The
configuration information and software to be installed on the target
machine reside on the Ignite-UX server, not on the tape. If you need
to build a fully self-contained recovery tape, see make_recovery(1m)
or make_media_lif(1m).

make_boot_tape is used in situations when you have target machines
that cannot boot via the network from the Ignite-UX server. This
happens either because the machine does not support booting from the
network or because it is not on the same subnet as the Ignite-UX
server. In this case, booting from a tape generated by make_boot_tape
means you do not need to set up a boot helper system. A tape created
by make_boot_tape can be used to kick off a normal Ignite-UX
installation. It can also be used to recover from recovery
configurations saved on the Ignite-UX server.

There is no "target-specific" information on the boot tape. Only
information about the Ignite-UX server is placed on the tape. Thus,
it is possible to initiate an installation of any target machine from
the same boot tape provided that the same Ignite-UX server is used.
Likewise, the target machine can be installed with any operating
system configuration that is available on the Ignite-UX server.

Typically, the make_boot_tape command is run from the Ignite-UX server
that you wish to connect with when booting from the tape later on.

A key file that contains configuration information is called
INSTALLFS. This file exists on the Ignite-UX server at
/opt/ignite/boot/INSTALLFS and is also present on the tape created by
make_boot_tape. See instl_adm(4) for details on the configuration file
syntax. Unless the -f option is used, the configuration information
already present in the INSTALLFS file is used on the tape as well.
The make_boot_tape command will never alter the INSTALLFS file on the
Ignite-UX server; it will only change the copy that is placed on the
tape.

Examples:
---------

Create a boot tape on the default tape drive (/dev/rmt/0m).

# make_boot_tape

Create a boot tape on a specified (non-default) tape drive. Create a
DDS1 device file for the tape drive first. Show as much information
about the tape creation as is possible.

ioscan -fC tape                  # to get the hardware path
mksf -v -H <hardware path> -b DDS1 -n -a
make_boot_tape -d /dev/<devfile created by mksf> -v

Create a boot tape and replace the configuration information contained
in the INSTALLFS file. Use the /tmp directory for all temporary files
instead of the default /var/tmp.

# instl_adm -d > tmp_config_file
## edit tmp_config_file as appropriate
# make_boot_tape -f tmp_config_file -t /tmp

Create a boot tape and specify a different gateway IP address. Set
the netmask value as well. All other configuration information is from
what is already in /opt/ignite/boot/INSTALLFS.

# make_boot_tape -g 15.23.34.123 -m 255.255.248.0

=============
10. uuencode:
=============

Unix to Unix Encoding. A method for converting files from Binary to ASCII so that
they can be sent across
the Internet via e-mail.

Encode binary file (to uuencoded ASCII file):

uuencode file remotefile
uudecode file

Example:

Encode binary file:

uuencode example example.en

Decode encoded file:

uudecode example.en

uuencode converts a binary file into an encoded representation that can be sent
using mail(1) .
It encodes the contents of source-file, or the standard input if no source-file
argument is given.
The decode_pathname argument is required. The decode_pathname is included in the
encoded file's header
as the name of the file into which uudecode is to place the binary (decoded) data.

uuencode also includes the permission modes of source-file (except setuid,
setgid, and sticky bits), so that decode_pathname is recreated with those same
permission modes.

example:
The following example packages up a source tree, compresses it, uuencodes it and
mails it to
a user on another system. When uudecode is run on the target system, the file
``src_tree.tar.Z''
will be created which may then be uncompressed and extracted into the original
tree.

# tar cf - src_tree | compress | uuencode src_tree.tar.Z | mail sys1!sys2!user
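
On the receiving system the steps are the reverse (a sketch; it assumes the mail
message was saved to a file called msgfile):

# uudecode msgfile               # recreates src_tree.tar.Z (mail headers around the encoded block are ignored)
# uncompress src_tree.tar.Z
# tar xf src_tree.tar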

example:
uuencode <file_a> <file_b> > <uufile>
(note: here, file_a is encoded and a new file named uufile is produced;
when you decode the file uufile, a file named file_b is produced)

# uuencode dipl.doc dipl.doc > dipl.uu


Here the file dipl.doc (for example a WinWord document) is converted into the file
dipl.uu. In doing so we specify that, after decoding, the file should again be
named dipl.doc.

example:
uuencode long_name.tar.Z arc.trz > arc.uue

11. grep command:
=================

# grep Sally people
# grep "Sally Smith" people
# grep -v "^$" people.old > people
# grep -v "^ *$" people.old > people      # deletes all blank lines
# grep "S.* D.*" people.old > people

12. sort command:
=================

sort files by size, largest first...

# ls -al | sort +4 -r | more

# sort +1 -2 people
# sort +2b people
# sort +2n +1 people
# sort +1 -2 *people > everybody
# sort -u +1 hardpeople softpeople > everybody # -u=unique
# sort -t: +5 /etc/passwd                        # -t sets the field separator

cp /etc/hosts /etc/hosts.`date +%Y%m%d`          # make a dated copy of a file

13. SED:
========

Can be used to replace a character string with a different string.

# sed s/string/newstring/ file

# sed s/Smith/White/ people.old > people
# sed "s/Sally Smith/Sally White/" people.old > people

You can also use a regular expression. For instance, we can put a left margin of 5
spaces on the people file:

# sed "s/^/     /" people.old > people

# sed "s/[0-9]*$//" people.old > people          # remove trailing numbers
# sed -e "s/^V^M//" filename > outputfilename    # remove carriage returns (type ^V^M as Ctrl-V Ctrl-M)

14. AWK:
========

When lines containing `foo' are found, they are printed, because `print $0' means
print the current line:
# awk '/foo/ { print $0 }' BBS-list

sums the sizes (in bytes) of all files in the ls listing whose month field is Nov:
# ls -l | awk '$6 == "Nov" { sum += $5 }
END { print sum }'

only print the lines containing Smith from file people:

# awk /Smith/ people

# awk '/gold/' coins.txt
# awk '/gold/ {print $0}' coins.txt
# awk '/gold/ {print $5,$6,$7,$8}' coins.txt
# awk '{if ($3 < 1980) print $3, " ",$5,$6,$7,$8}' coins.txt

# awk '/Smith/ {print $1 "-" $3}' people

# ls -l /home | awk '{total += $5}; END {print total}'
# ls -lR /home | awk '{total += $5}; END {print total}'

Example:
--------

Suppose you have a text file with lines much longer than, say, 72 characters,
and you want a file whose lines are at most 72 characters long. You might use awk
in the following way:

-- Shell file r13.sh:

#!/bin/bash

DIR=/cygdrive/c/exports
FILE=result24.txt

awk -f r13.awk ${DIR}/${FILE} > ${DIR}/${FILE}.new

-- r13.awk

BEGIN { maxlength=72 }
{
l=length();
if (l > 72) {
i=(l/72)
for (j=0; j<i; j++) {
printf "%s\r\n",substr($0, (j*72)+1, maxlength)
}
} else {
printf "%s\r\n",$0
}
}

15. tr command:
===============

Used for translating characters in a file. tr works on standard input, so if you
want to take input from a file you have to redirect standard input so that it
comes from that file.

Suppose we want to replace all characters in the range a-z by the characters A-Z:

# tr "[a-z]" "[A-Z]" < people

squeeze multiple occurrences of a character (e.g. a space) into one:

# tr -s " " < people.old > people

remove blank lines (squeeze runs of newlines into one):

# tr -s "\012" < people.old > people

to remove the Microsoft carriage returns:
# tr -d '\015' < original.file > new.file
# cat filename1 | tr -d "^V^M" > newfile     # type ^V^M as Ctrl-V Ctrl-M

#! /bin/sh
#
# recursive dark side repair technique
# eliminates spaces in file names from current directory down
# useful for supporting systems where clueless vendors promote NT
#
# read the names line by line (so names containing spaces survive word splitting)
# and actually rename them instead of only printing them
find . -depth -print | while read name
do
na=`echo "$name" | tr ' ' '_'`
if [ "$na" != "$name" ]
then
echo "renaming: $name"
mv "$name" "$na"
fi
done

note:

> I have finally competed setting up the samba server and setup the share
> between NT and Samba server.
>
> However, when I open a unix text file in Windows NT using notepad, i see
> many funny characters and the text file is not in order (Just like when I
> ftp the unix text file out into NT in binary format) ...I think this has to
> be something to do with whether the file transfer is in Binary format or
> ASCII ... Is there a parameter to set for this ? I have checked the
> documents ... but couldn't find anything on this ...
>

This is a FAQ, but in brief, it's like this. Unix uses a single newline
character to end a line ("\n"), while DOS/Win/NT use a
carriage-return/newline pair ("\r\n"). FTP in ASCII mode translates
these for you. FTP in binary mode, or other forms of file transfer, such
as Samba, leave the file unaltered. Doing so would be extremely
dangerous, as there's no clear way to isolate which files should be
translated

You can get Windows editors that understand Unix line-end conventions
(Ultra Edit is one), or you can use DOS line endings on the files, which
will then look odd from the Unix side. You can stop using notepad, and
use Wordpad instead, which will deal appropriately with Unix line
endings.

You can convert a DOS format text file to Unix with this:-

tr -d '\r' < dosfile.txt > unixfile.txt

The best solution to this seems to be using a Windows editor that can
handle working with Unix line endings.

HTH

Mike.
Note:

There are two characters involved in moving to a new line: carriage return, which
is chr(13), and newline, which is chr(10). In Windows you're supposed to use a
sequence of a carriage return followed by a newline.
For example, in VB you can use Wrap$ = Chr$(13) + Chr$(10), which creates a wrap
sequence.
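
For reference, both conversions can be done with standard tools (a sketch; the
filenames are only examples):

# DOS -> Unix: strip the carriage returns
tr -d '\r' < dosfile.txt > unixfile.txt

# Unix -> DOS: put a carriage return before every newline
awk '{ printf "%s\r\n", $0 }' unixfile.txt > dosfile.txt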

16. cut and paste:
==================

cutting columns:

# cut -c17,18,19 people
# cut -c17- people > phones
# cut -c1-16 people > names

cutting fields:

# cut -d" " -f1,2 people > names        # -d sets the field separator

paste:

# paste -d" " firstname lastname phones > people

17. mknod:
==========

mknod creates a FIFO (named pipe), character special file, or block special file
with the specified name.
A special file is a triple (boolean, integer, integer) stored in the filesystem.
The boolean chooses between character special file and block special file.
The two integers are the major and minor device number.

Thus, a special file takes almost no place on disk, and is used only for
communication
with the operating system, not for data storage. Often special files refer to
hardware devices
(disk, tape, tty, printer) or to operating system services (/dev/null,
/dev/random).

Block special files usually are disk-like devices
(where data can be accessed given a block number, and e.g. it is meaningful to
have a block cache). All other devices are character special files.
(Long ago the distinction was a different one: I/O to a character special file
would be unbuffered, I/O to a block special file buffered.)

The mknod command is what creates files of this type.

The argument following name specifies the type of file to make:

p for a FIFO
b for a block (buffered) special file
c for a character (unbuffered) special file

When making a block or character special file, the major and minor device numbers
must be given
after the file type (in decimal, or in octal with leading 0; the GNU version also
allows hexadecimal
with leading 0x). By default, the mode of created files is 0666 (`a/rw') minus the
bits set in the umask.

In /dev we find logical devices, created by the mknod command.

# mknod /dev/kbd c 11 0
# mknod /dev/sunmouse c 10 6
# mknod /dev/fb0 c 29 0

create a pipe in /dev called 'rworldlp'

# mknod /dev/rworldlp p; chmod a+rw /dev/rworldlp
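
You can verify what was created with ls -l: for special files the major and minor
numbers appear where a regular file shows its size, and the first character of the
mode field gives the type (a sketch; the exact owners, dates and numbers will
differ on your system):

# ls -l /dev/kbd /dev/rworldlp
crw-rw-rw-   1 root  system   11,  0 Mar 10 12:00 /dev/kbd         ('c' = character special, major 11, minor 0)
prw-rw-rw-   1 root  system        0 Mar 10 12:00 /dev/rworldlp    ('p' = named pipe / FIFO)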

If one cannot afford to buy extra disk space one can run the export and compress
utilities simultaneously.
This will prevent the need to get enough space for both the export file AND the
compressed export file. Eg:

# Make a pipe
mknod expdat.dmp p # or mkfifo pipe
# Start compress sucking on the pipe in background
compress < expdat.dmp > expdat.dmp.Z &
# Wait a second or two before kicking off the export
sleep 5
# Start the export
exp scott/tiger file=expdat.dmp

Create a compressed export on the fly.

# create a named pipe
mknod exp.pipe p
# read the pipe - output to zip file in the background
gzip < exp.pipe > scott.exp.gz &
# feed the pipe
exp userid=scott/tiger file=exp.pipe ...

18. Links:
==========

A symbolic link is a pointer or an alias to another file. The command

# ln -s fromfile /other/directory/tolink

makes the file fromfile appear to exist at /other/directory/tolink simultaneously.

The file is not copied, it merely appears to be a part of the file tree in two
places.
Symbolic links can be made to both files and directories.

The usage of the link command is.

%ln -s ActualFilename LinkFileName

Where -s indicates a symbolic link. ActualFilename is the name of the file which
is to be linked to,
and LinkFileName is the name by which the file should be known.

You should use full paths in the command.

This example shows copying three files from a directory into the current working
directory.

[2]%cp ~team/IntroProgs/MoreUltimateAnswer/more* .
[3]%ls -l more*
-rw-rw-r-- 1 mrblobby mrblobby 632 Sep 21 18:12 moreultimateanswer.adb
-rw-rw-r-- 1 mrblobby mrblobby 1218 Sep 21 18:19 moreultimatepack.adb
-rw-rw-r-- 1 mrblobby mrblobby 784 Sep 21 18:16 moreultimatepack.ads

The three files take a total of 2634 bytes. The equivalent ln commands would be:

[2]%ln -s ~team/IntroProgs/MoreUltimateAnswer/moreultimateanswer.adb .
[3]%ln -s ~team/IntroProgs/MoreUltimateAnswer/moreultimatepack.adb .
[4]%ln -s ~team/IntroProgs/MoreUltimateAnswer/moreultimatepack.ads .
[5]%ls -l
lrwxrwxrwx 1 mrblobby mrblobby 35 Sep 22 08:50 moreultimateanswer.adb -> /users/team/IntroProgs/MoreUltimateAnswer/moreultimateanswer.adb
lrwxrwxrwx 1 mrblobby mrblobby 37 Sep 22 08:49 moreultimatepack.adb -> /users/team/IntroProgs/MoreUltimateAnswer/moreultimatepack.adb
lrwxrwxrwx 1 mrblobby mrblobby 37 Sep 22 08:50 moreultimatepack.ads -> /users/team/IntroProgs/MoreUltimateAnswer/moreultimatepack.ads

19. Relinking Oracle:
=====================

info:

showrev -p
pkginfo -i

relink:

make -f $ORACLE_HOME/rdbms/lib/ins_rdbms.mk install
make -f $ORACLE_HOME/svrmgr/lib/ins_svrmgr.mk install
make -f $ORACLE_HOME/network/lib/ins_network.mk install
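
On more recent Oracle releases the individual make calls are wrapped by the relink
script (a sketch; availability and the supported arguments depend on your Oracle
version):

cd $ORACLE_HOME/bin
./relink all            # relinks the executables of all installed products
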
20. trace:
==========

20.1 truss on Solaris:
----------------------

truss -aef -o /tmp/trace svrmgrl

To trace what a Unix process is doing enter:

truss -rall -wall -p <PID>

$ truss lsnrctl dbsnmp_start

NOTE: The "truss" command works on SUN and Sequent. Use "tusc" on HP-UX, "strace"
on Linux,
"trace" on SCO Unix or call your system administrator to find the equivalent
command on your system.
Monitor your Unix system:

Solaris:

Truss is used to trace the system/library calls (not user calls) and signals
made/received
by a new or existing process. It sends the output to stderr.

NOTE: Trussing a process throttles that process to your display speed. Use -wall
and -rall sparingly.

Truss usage:

truss -a -e -f -rall -wall -p <PID>
truss -a -e -f -rall -wall <program>

-a         Show arguments passed to the exec system calls
-e         Show environment variables passed to the exec system calls
-f         Show forked processes (they will have a different pid: in column 1)
-rall      Show all read data (default is 32 bytes)
-wall      Show all written data (default is 32 bytes)
-p         Hook to an existing process (must be owner or root)
<program>  Specify a program to run

Truss examples
# truss -rall -wall -f -p <PID>
# truss -rall -wall lsnrctl start
# truss -aef lsnrctl dbsnmp_start

20.2 syscalls command on AIX:
-----------------------------

1. syscalls Command

Purpose
Provides system call tracing and counting for specific processes and the system.

Syntax
To Create or Destroy Buffer:
syscalls [ [ -enable bytes ]| -disable ]

To Print System Call Counts:
syscalls -c

To Print System Call Events or Start Tracing:
syscalls [ -o filename ] [ -t ] { [ [ -p pid ] -start | -stop ] | -x program }

Description
The syscalls (system call tracing) command, captures system call entry and exit
events by individual processes
or all processes on the system. The syscalls command can also maintain counts for
all system calls
made over long periods of time.

Notes:
System call events are logged in a shared-memory trace buffer. The same shared
memory identifier may be used
by other processes resulting in a collision. In such circumstances, the -enable
flag needs to be issued.
The syscalls command does not use the trace daemon.
The system crashes if ipcrm -M sharedmemid is run after syscalls has been run.
Run stem -shmkill instead of running ipcrm -M to remove the shared memory segment.

Flags
-c Prints a summary of system call counts for all processes. The counters are not
reset.

-disable Destroys the system call buffer and disables system call tracing and
counting.

-enable bytes Creates the system call trace buffer. If this flag is not used, the
syscalls command
creates a buffer of the default size of 819,200 bytes. Use this flag if events
are not being logged
in the buffer. This is the result of a collision with another process using the
same shared memory buffer ID.

-o filename Prints output to filename rather than standard out.

-p pid When used with the -start flag, only events for processes with this pid
will be logged
in the syscalls buffer. When used with the -stop option, syscalls filters the
data in the buffer
and only prints output for this pid.

-start Resets the trace buffer pointer. This option enables the buffer if it does
not exist and resets
the counters to zero.

-stop Stops the logging of system call events and prints the contents of the
buffer.

-t Prints the time associated with each system call event alongside the event.

-x program Runs program while logging events for only that process. The buffer is
enabled if needed.
Security
Access Control: You must be root or a member of the perf group to run this
command.

Examples
To collect system calls for a particular program, enter:
syscalls -x /bin/ps
Output similar to the following appears:
PID TTY TIME CMD
19841 pts/4 0:01 /bin/ksh
23715 pts/4 0:00 syscalls -x /bin/ps
30720 pts/4 0:00 /bin/ps
34972 pts/4 0:01 ksh
PID System Call
30720 .kfork Exit , return=0 Call preceded tracing.
30720 .getpid () = 30720
30720 .sigaction (2, 2ff7eba8, 2ff7ebbc) = 0
30720 .sigaction (3, 2ff7eba8, 2ff7ebcc) = 0
30720 .sigprocmask (0, 2ff7ebac, 2ff7ebdc) = 0
30720 .sigaction (20, 2ff7eba8, 2ff7ebe8) = 0
30720 .kfork () = 31233
30720 .kwaitpid (2ff7ebfc, 31233, 0, 0) = 31233
30720 .sigaction (2, 2ff7ebbc, 0) = 0
30720 .sigaction (3, 2ff7ebcc, 0) = 0
30720 .sigaction (20, 2ff7ebe8, 0) = 0
30720 .sigprocmask (2, 2ff7ebdc, 0) = 0
30720 .getuidx (4) = 0
30720 .getuidx (2) = 0
30720 .getuidx (1) = 0
30720 .getgidx (4) = 0
30720 .getgidx (2) = 0
30720 .getgidx (1) = 0
30720 ._load NoFormat, (0x2ff7ef54, 0x0, 0x0, 0x2ff7ff58) = 537227760
30720 .sbrk (65536) = 537235456
30720 .getpid () = 30720

To produce a count of system calls made by all processes, enter:
syscalls -start
followed by entering:
syscalls -c
Output similar to the following appears:
System Call Counts for all processes
5041 .lseek
4950 .kreadv
744 .sigaction
366 .close
338 .sbrk
190 .kioctl
120 .getuidx
116 .kwritev
108 .kfcntl
105 .getgidx
95 .kwaitpid
92 .gettimer
92 .select
70 .getpid
70 .sigprocmask
52 .execve
51 ._exit
51 .kfork
35 .open
35 ._load
33 .pipe
33 .incinterval
28 .sigreturn
27 .access
16 .brk
15 .times
15 .privcheck
15 .gettimerid
10 .statx
9 .STEM_R10string
4 .sysconfig
3 .P2counters_accum
3 .shmget
3 .shmat
2 .setpgid
2 .shmctl
2 .kioctl
1 .Patch_Demux_Addr_2
1 .Patch_Demux_Addr_High
1 .STEM_R3R4string
1 .shmdt
1 .Stem_KEX_copy_demux_entry
1 .STEM_R3R4string
1 .Patch_Demux_Addr_1
1 .pause
1 .accessx
Files
/usr/bin/syscalls Contains the syscalls command.

20.3 truss command on AIX:
--------------------------

AIX 5.1,5.2,5.3

The truss command is also available for SVR4 UNIX-based environments. This command
is useful for tracing
system calls in one or more processes. In AIX 5.2, all base system call parameter
types are now recognized.
In AIX 5.1, only about 40 system calls were recognized.

Truss is a /proc based debugging tool that executes and traces a command, or
traces an existing process.
It prints names of all system calls made with their arguments and return code.
System call parameters are
displayed symbolically. It prints information about all signals received by a
process. The AIX 5.2 version
supports library calls tracing. For each call, it prints parameters and return
codes.
It can also trace a subset of libraries and a subset of routines in a given
library. The timestamps on each line
are also supported.

In AIX 5.2, truss is packaged with bos.sysmgt.serv_aid, which is installable from
the AIX base installation media.
See the command reference for details and examples, or use the information below.

-a Displays the parameter strings that are passed in each executed system call.

# truss -a sleep

execve("/usr/bin/sleep", 0x2FF22980, 0x2FF22988) argc: 1
argv: sleep
sbrk(0x00000000) = 0x200007A4
sbrk(0x00010010) = 0x200007B0
getuidx(4) = 0
__loadx(0x01000080, 0x2FF1E790, 0x00003E80, 0x2FF22720, 0x00000000) = 0xD0077130
access("/usr/lib/nls/msg/en_US/sleep.cat", 0) = 0
_getpid() = 31196
open("/usr/lib/nls/msg/en_US/sleep.cat", O_RDONLY) = 3
kioctl(3, 22528, 0x00000000, 0x00000000) Err#25 ENOTTY
kfcntl(3, F_SETFD, 0x00000001) = 0
kioctl(3, 22528, 0x00000000, 0x00000000) Err#25 ENOTTY
kread(3, "\0\001 �\001\001 I S O 8".., 4096) = 123
lseek(3, 0, 1) = 123
lseek(3, 0, 1) = 123
lseek(3, 0, 1) = 123
_getpid() = 31196
lseek(3, 0, 1) = 123
Usage: sleep Seconds
kwrite(2, " U s a g e : s l e e p".., 21) = 21
kfcntl(1, F_GETFL, 0x00000000) = 2
kfcntl(2, F_GETFL, 0x00000000) = 2
_exit(2)

-c Counts traced system calls, faults, and signals rather than displaying trace
results line by line.
A summary report is produced after the traced command terminates or when truss is
interrupted.
If the -f flag is also used, the counts include all traced Syscalls, Faults, and
Signals for child processes.

# truss �c ls

syscall seconds calls errors
execve .00 1
__loadx .00 17
_exit .00 1
close .00 2
kwrite .00 5
lseek .00 1
setpid .00 1
getuidx .00 19
getdirent .00 3
kioctl .00 3
open .00 1
statx .00 2
getgidx .00 18
sbrk .00 4
access .00 1
kfcntl .00 6
---- --- ---
sys totals: .01 85 0
usr time: .00
elapsed: .01

More truss examples:
--------------------

truss -o /tmp/tst -p 307214

root@zd93l14:/tmp#cat tst
= 0
_nsleep(0x4128B8E0, 0x4128B958) = 0
_nsleep(0x4128B8E0, 0x4128B958) = 0
_nsleep(0x4128B8E0, 0x4128B958) = 0
_nsleep(0x4128B8E0, 0x4128B958) = 0
thread_tsleep(0, 0xF033159C, 0x00000000, 0x43548E38) = 0
thread_tsleep(0, 0xF0331594, 0x00000000, 0x434C3E38) = 0
thread_tsleep(0, 0xF033158C, 0x00000000, 0x4343FE38) = 0
thread_tsleep(0, 0xF0331584, 0x00000000, 0x433BBE38) = 0
thread_tsleep(0, 0xF0331574, 0x00000000, 0x432B2E38) = 0
thread_tsleep(0, 0xF033156C, 0x00000000, 0x4322EE38) = 0
thread_tsleep(0, 0xF0331564, 0x00000000, 0x431AAE38) = 0
thread_tsleep(0, 0xF0331554, 0x00000000, 0x42F99E38) = 0
thread_tsleep(0, 0xF033154C, 0x00000000, 0x4301DE38) = 0
thread_tsleep(0, 0xF0331534, 0x00000000, 0x42E90E38) = 0
thread_tsleep(0, 0xF033152C, 0x00000000, 0x42E0CE38) = 0
thread_tsleep(0, 0xF033157C, 0x00000000, 0x43337E38) = 0
thread_tsleep(0, 0xF0331544, 0x00000000, 0x42F14E38) = 0
= 0
thread_tsleep(0, 0xF033153C, 0x00000000, 0x42D03E38) = 0
_nsleep(0x4128B8E0, 0x4128B958) = 0

20.4 man pages for truss AIX:
-----------------------------

Purpose

Traces a process's system calls, dynamically loaded user level function calls,
received signals, and incurred machine faults.

Syntax

truss [ -f] [ -c] [ -a] [ -l ] [ -d ] [ -D ] [ -e] [ -i] [ { -t | -x} [!] Syscall [...] ]
      [ -s [!] Signal [...] ] [ { -m }[!] Fault [...]] [ { -r | -w} [!] FileDescriptor [...] ]
      [ { -u } [!]LibraryName [...]:: [!]FunctionName [ ... ] ] [ -o Outfile] {Command| -p pid [. . .]}

Description
The truss command executes a specified command, or attaches to listed process
IDs, and produces a trace of the system calls, received signals, and machine
faults a process incurs. Each line of the trace output reports either the Fault
or Signal name, or the Syscall name with parameters and return values. The
subroutines defined in system libraries are not necessarily the exact system
calls made to the kernel. The truss command does not report these subroutines,
but rather, the underlying system calls they make. When possible, system call
parameters are displayed symbolically using definitions from relevant system
header files. For path name pointer parameters, truss displays the string being
pointed to. By default, undefined system calls are displayed with their name,
all eight possible arguments and the return value in hexadecimal format.

When the -o flag is used with truss, or if standard error is redirected to a
non-terminal file, truss ignores the hangup, interrupt, and quit signals.
This facilitates the tracing of interactive programs which catch interrupt and
quit signals from the terminal.

If the trace output remains directed to the terminal, or if existing processes
are traced (using the -p flag), then truss responds to hangup, interrupt, and
quit signals by releasing all traced processes and exiting. This enables the
user to terminate excessive trace output and to release previously existing
processes. Released processes continue to function normally.

Flags

-a Displays the parameter strings which are passed in each executed system call.

-c Counts traced system calls, faults, and signals rather than displaying trace
results line by line. A summary report is produced after the traced command
terminates or when truss is interrupted. If the -f flag is also used, the counts
include all traced Syscalls, Faults, and Signals for child processes.

-d A timestamp will be included with each line of output. Time displayed is in
seconds relative to the beginning of the trace. The first line of the trace
output will show the base time from which the individual time stamps are
measured. By default timestamps are not displayed.

-D Delta time is displayed on each line of output. The delta time represents the
elapsed time for the LWP that incurred the event since the last reported event
incurred by that thread. By default delta times are not displayed.

-e Displays the environment strings which are passed in each executed system
call.

-f Follows all children created by the fork system call and includes their
signals, faults, and system calls in the trace output. Normally, only the
first-level command or process is traced. When the -f flag is specified, the
process id is included with each line of trace output to show which process
executed the system call or received the signal.

-i Keeps interruptible sleeping system calls from being displayed. Certain
system calls on terminal devices or pipes, such as open and kread, can sleep for
indefinite periods and are interruptible. Normally, truss reports such sleeping
system calls if they remain asleep for more than one second. The system call is
then reported a second time when it completes. The -i flag causes such system
calls to be reported only once, upon completion.

-l Display the id (thread id) of the responsible LWP process along with truss
output. By default LWP id is not displayed in the output.

-m [!]Fault Traces the machine faults in the process. Machine faults to trace
must be separated from each other by a comma. Faults may be specified by name or
number (see the sys/procfs.h header file). If the list begins with the "!"
symbol, the specified faults are excluded from being traced and are not
displayed with the trace output. The default is -mall -m!fltpage.

-o Outfile Designates the file to be used for the trace output. By default, the
output goes to standard error.

-p Interprets the parameters to truss as a list of process ids for an existing
process rather than as a command to be executed. truss takes control of each
process and begins tracing it, provided that the user id and group id of the
process match those of the user or that the user is a privileged user.

-r [!] FileDescriptor Displays the full contents of the I/O buffer for each read
on any of the specified file descriptors. The output is formatted 32 bytes per
line and shows each byte either as an ASCII character (preceded by one blank) or
as a two-character C language escape sequence for control characters, such as
horizontal tab (\t) and newline (\n). If ASCII interpretation is not possible,
the byte is shown in two-character hexadecimal representation. The first 16
bytes of the I/O buffer for each traced read are shown, even in the absence of
the -r flag. The default is -r!all.

-s [!] Signal Permits listing Signals to trace or exclude. Those signals
specified in a list (separated by a comma) are traced. The trace output reports
the receipt of each specified signal even if the signal is being ignored, but
not blocked, by the process. Blocked signals are not received until the process
releases them. Signals may be specified by name or number (see sys/signal.h). If
the list begins with the "!" symbol, the listed signals are excluded from being
displayed with the trace output. The default is -s all.

-t [!] Syscall Includes or excludes system calls from the trace process. System
calls to be traced must be specified in a list and separated by commas. If the
list begins with an "!" symbol, the specified system calls are excluded from the
trace output. The default is -tall.

-u [!] [LibraryName [...]::[!]FunctionName [...] ]

Traces dynamically loaded user level function calls from user libraries. The
LibraryName is a comma-separated list of library names. The FunctionName is a
comma-separated list of function names. In both cases the names can include
name-matching metacharacters *, ?, [] with the same meanings as interpreted by
the shell but as applied to the library/function name spaces, and not to files.

A leading ! on either list specifies an exclusion list of names of libraries or
functions not to be traced. Excluding a library excludes all functions in that
library. Any function list following a library exclusion list is ignored.
Multiple -u options may be specified and they are honored left-to-right. By
default no library/function calls are traced.

-w [!] FileDescriptor Displays the contents of the I/O buffer for each write on
any of the listed file descriptors (see -r). The default is -w!all.

-x [!] Syscall Displays data from the specified parameters of traced system calls
in raw format, usually hexadecimal, rather than symbolically. The default is
-x!all.

Examples

1. To produce a trace of the find command on the terminal, type:

truss find . -print >find.out

2. To trace the lseek, close, statx, and open system calls, type:

truss -t lseek,close,statx,open find . -print > find.out

3. To display thread id along with regular output for find command, enter:
truss -l find . -print >find.out

4. To display timestamps along with regular output for find command, enter:
truss -d find . -print >find.out

5. To display delta times along with regular output for find command, enter:
truss -D find . -print >find.out

6. To trace the malloc() function call and exclude the strlen() function call
in the libc.a library while running the ls command, enter:
truss -u libc.a::malloc,!strlen ls

7. To trace all function calls in the libc.a library with names starting with
"m" while running the ls command, enter:
truss -u libc.a::m*,!strlen ls

8. To trace all function calls from the library libcurses.a and exclude calls
from libc.a while running executable foo, enter:
truss -u libcurses.a,!libc.a::* foo

9. To trace the refresh() function call from libcurses.a and the malloc()
function call from libc.a while running the executable foo, enter:
truss -u libc.a::malloc -u libcurses.a::refresh foo
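As a final, combined sketch (not part of the man page examples): trace only the open and statx
calls of an already running process and write the output to a file; the PID 10715 is just a
placeholder:

truss -t open,statx -o /tmp/app.trc -p 10715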

20.5 Note: How to trace an AIX machine:
---------------------------------------

The trace facility and commands are provided as part of the Software Trace Service
Aids fileset
named bos.sysmgt.trace.

To see if this fileset is installed, use the following command:

# lslpp -l | grep bos.sysmgt.trace

Taking a trace:
---------------

The events traced are referenced by hook identifiers. Each hook ID uniquely refers
to a particular activity that can be traced. When tracing, you can select the hook IDs
of interest and exclude others that are not relevant to your problem. A trace hook ID
is a 3-digit hexadecimal number that identifies an event being traced.
Trace hook IDs are defined in the "/usr/include/sys/trchkid.h" file.

The currently defined trace hook IDs can be listed using the trcrpt command:

# trcrpt -j | sort | pg

001 TRACE ON
002 TRACE OFF
003 TRACE HEADER
004 TRACEID IS ZERO
005 LOGFILE WRAPAROUND
006 TRACEBUFFER WRAPAROUND
..
..

The trace daemon configures a trace session and starts the collection of system
events.
The data collected by the trace function is recorded in the trace log. A report
from the trace log
can be generated with the trcrpt command.

When invoked with the -a, -x, or -X flags, the trace daemon is run asynchronously
(i.e. as a background task).
Otherwise, it is run interactively and prompts you for subcommands.

Some trace examples:

# trace -adf -C all -r PURR -o trace.raw
# trace -Jfop fact proc procd filephys filepfsv filepvl filepvld locks -A786578 -Pp -a
# trace -Jfop fact proc procd filephys filepfsv filepvl filepvld locks -Pp -a

Some trcrpt examples:

1 To format the trace log file and print the result, enter:

trcrpt | qprt

2 To send a trace report to the /tmp/newfile file, enter:

trcrpt -o /tmp/newfile

3 To display process IDs and exec path names in the trace report, enter:

trcrpt -O pid=on,exec=on -o /tmp/newfile


4 To create trace ID histogram data, enter:

trcrpt -O hist=on
5 To produce a list of all event groups, enter:

trcrpt -G
The format of this report is shown under the trcevgrp command.
6 To generate back-to-back LMT reports from the common and rare buffers,
specify:
trcrpt -M all
7 If, in the above example, the LMT files reside at /tmp/mydir, and we
want the LMT traces to be merged,
specify:

trcrpt -m -M all:/tmp/mydir
8 To merge the system trace with the scdisk.hdisk0 component trace,
specify:

trcrpt -m -l scdisk.hdisk0 /var/adm/ras/trcfile


9 To merge LMT with the system trace while not eliminating duplicate
events, specify:

trcrpt -O removedups=off -m -M all /var/adm/ras/trcfile


10 To merge all component traces in /tmp/mydir with the LMT traces in the
default LMT directory
while showing the source file for each trace event, specify:

trcrpt -O filename=on -m -M all /tmp/mydir


Note: This is equivalent to:

trcrpt -O filename=on -m -M all -l all:/tmp/mydir

Note: If the traces are from a 64-bit kernel, duplicate entries will be removed.
However, on the 32-bit kernel, duplicate entries will not be removed, since we do
not know the CPU IDs of the entries in the component traces.
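Putting the pieces together, a minimal asynchronous trace session typically looks like
the sketch below; the file names are arbitrary examples:

trace -a -o /tmp/trc_raw                                # start tracing asynchronously
# ... run or reproduce the workload you want to look at ...
trcstop                                                 # stop the trace session
trcrpt -O exec=on,pid=on /tmp/trc_raw > /tmp/trc.rpt    # format the raw file into a report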

Another example of the usage of trace:
--------------------------------------

>> Obtaining a Sample Trace File

Trace data accumulates rapidly. We want to bracket the data collection as closely
around the area of interest
as possible. One technique for doing this is to issue several commands on the same
command line. For example:

$ trace -a -k "20e,20f" -o ./trc_raw ; cp ../bin/track /tmp/junk ; trcstop

captures the execution of the cp command. We have used two features of the trace
command. The -k "20e,20f" option
suppresses the collection of events from the lockl and unlockl functions. These
calls are numerous and add volume
to the report without adding understanding at the level we're interested in. The
-o ./trc_raw option causes the
raw trace output file to be written in our local directory.

Note: This example is more educational if the input file is not already cached in
system memory. Choose as the source
file any file that is about 50KB and has not been touched recently.

>> Formatting the Sample Trace


We use the following form of the trcrpt command for our report:

$ trcrpt -O "exec=on,pid=on" trc_raw > /tmp/cp.rpt

This reports both the fully qualified name of the file that is execed and the
process ID that is assigned to it.

A quick look at the report file shows us that there are numerous VMM page assign
and delete events in the trace,
like the following sequence:

1B1 ksh 8525 0.003109888 0.162816 VMM page delete: V.S=0000.150E ppage=1F7F
    delete_in_progress process_private working_storage

1B0 ksh 8525 0.003141376 0.031488 VMM page assign: V.S=0000.2F33 ppage=1F7F
    delete_in_progress process_private working_storage

We are not interested in this level of VMM activity detail at the moment, so we
reformat the trace with:

$ trcrpt -k "1b0,1b1" -O "exec=on,pid=on" trc_raw > cp.rpt2

The -k "1b0,1b1" option suppresses the unwanted VMM events in the formatted
output. It saves us from having
to retrace the workload to suppress unwanted events. We could have used the -k
function of trcrpt instead of
that of the trace command to suppress the lockl and unlockl events, if we had
believed that we might need
to look at the lock activity at some point. If we had been interested in only a
small set of events,
we could have specified -d "hookid1,hookid2" to produce a report with only those
events. Since the hook ID
is the left-most column of the report, you can quickly compile a list of hooks to
include or exclude.

A comprehensive list of Trace hook IDs is defined in /usr/include/sys/trchkid.h.

>> Reading a Trace Report

The header of the trace report tells you when and where the trace was taken, as
well as the command that was
used to produce it:

Fri Nov 19 12:12:49 1993
System: AIX ptool Node: 3
Machine: 000168281000
Internet Address: 00000000 0.0.0.0
trace -ak 20e,20f -o ./trc_raw

The body of the report, if displayed in a small enough font, looks as follows:
ID  PROCESS NAME  PID   ELAPSED_SEC  DELTA_MSEC  APPL  SYSCALL  KERNEL  INTERRUPT
101 ksh           8525  0.005833472  0.107008          kfork
101 ksh           7214  0.012820224  0.031744          execve
134 cp            7214  0.014451456  0.030464          exec cp ../bin/trk/junk

In cp.rpt you can see the following phenomena:

- The fork, exec, and page fault activities of the cp process
- The opening of the input file for reading and the creation of the /tmp/junk file
- The successive read/write system calls to accomplish the copy
- The process cp becoming blocked while waiting for I/O completion, and the wait
  process being dispatched
- How logical-volume requests are translated to physical-volume requests
- The files are mapped rather than buffered in traditional kernel buffers, and the
  read accesses cause page faults that must be resolved by the Virtual Memory Manager
- The Virtual Memory Manager senses sequential access and begins to prefetch the
  file pages; the size of the prefetch becomes larger as sequential access continues
- When possible, the disk device driver coalesces multiple file requests into one
  I/O request to the drive

The trace output looks a little overwhelming at first. This is a good example to
use as a learning aid.
If you can discern the activities described, you are well on your way to being
able to use the trace facility
to diagnose system-performance problems.

>> Filtering of the Trace Report

The full detail of the trace data may not be required. You can choose specific
events of interest to be shown.
For example, it is sometimes useful to find the number of times a certain event
occurred. To answer the question
"How many opens occurred in the copy example?" first find the event ID for the
open system call.
This can be done as follows:

$ trcrpt -j | grep -i open

You should be able to see that event ID 15b is the open event. Now, process the
data from the copy example as follows:

$ trcrpt -d 15b -O "exec=on" trc_raw

The report is written to standard output, and you can determine the number of open
subroutines that occurred.
If you want to see only the open subroutines that were performed by the cp
process, run the report command
again using the following:

$ trcrpt -d 15b -p cp -O "exec=on" trc_raw
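If all you want is the count itself, you can pipe the filtered report through grep.
A rough sketch; the exact text of each formatted line depends on the hook template,
so treat the number as approximate:

$ trcrpt -d 15b -O "exec=on" trc_raw | grep -c " open "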


A Wrapper around trace:
-----------------------

Simple instructions for using the AIX trace facility

>> Five aix commands are used:

-trace
-trcon
-trcoff
-trcstop
-trcrpt

These are described in AIX Commands Reference, Volume 5, but hopefully you won't
have to dig into that.

Scripts to download:
I've provided wrappers for the trace and trcrpt commands since there are various
command-line parameters to specify.

-atrace
-atrcrpt

>> Contents atrace:

# To change from the default trace file, set TRCFILE to
# the name of the raw trace file name here; this should
# match the name of the raw trace file in atrcrpt.
# Don't do this on AIX 4.3.3 ML 10, where you'll need
# to use the default trace file, /usr/adm/ras/trcfile
#TRCFILE="-o /tmp/raw"

# trace categories not to collect
IGNORE_VMM="1b0,1b1,1b2,1b3,1b5,1b7,1b8,1b9,1ba,1bb,1bc,1bd,1be"
IGNORE_LOCK=20e,20f
IGNORE_PCI=2e6,2e7,2e8
IGNORE_SCSI=221,223
IGNORE_OTHER=100,10b,116,119,11f,180,234,254,2dc,402,405,469,7ff

IGNORE="$IGNORE_VMM,$IGNORE_LOCK,$IGNORE_PCI,$IGNORE_SCSI,$IGNORE_LVM,$IGNORE_OTHER"

trace -a -d -k $IGNORE $TRCFILE

>> Contents atrcrpt:

# To change from the default trace file, set TRCFILE to
# the name of the raw trace file name here; this should
# match the name of the raw trace file in atrace.
# Don't do this on AIX 4.3.3 ML 10, where you'll need
# to use the default trace file, /usr/adm/ras/trcfile
# TRCFILE=/tmp/raw

# edit formatted trace file name here
FMTFILE=/tmp/fmt

trcrpt -O pid=on,tid=on,timestamp=1 $TRCFILE >$FMTFILE


Setup instructions:

- Edit atrace and atrcrpt and ensure that the names of the files for the raw and formatted
  trace are appropriate. Please see the comments in the scripts about 4.3.3 ML 10 being broken
  for trcrpt, such that the default file name needs to be used. You may find that specifying
  non-default filenames does not have the desired effect.
- Make atrace and atrcrpt executable via chmod.

Data collection

./atrace     (this is my wrapper for the trace command)
trcon
             (at this point we're collecting the trace; wait for a bit of time to
             trace whatever the failure is)
trcoff
trcstop
./atrcrpt    (this is my wrapper for formatting the report)

After running atrcrpt, the formatted report will be in file /tmp/fmt.

Sample section of formatted trace

Note that failing system calls generally show "error Esomething" in the trace, as
highlighted below. The second column is the process id and the third column is the
thread id. Once you see something of interest in the trace, you may want to use grep
to pull out all records for that process id, since in general the trace is interleaved
with the activity of all the processes in the system.

101 14690 19239 statx LR = D0174110
107 14690 19239 lookuppn: /usr/HTTPServer/htdocs/en_US/manual/ibm/index.htmlxxxxxxxxxxx
107 14690 19239 lookuppn: file not found
104 14690 19239 return from statx. error ENOENT [79 usec]
101 14690 19239 statx LR = D0174110
107 14690 19239 lookuppn: /usr/HTTPServer/htdocs/en_US/manual/ibm
104 14690 19239 return from statx [36 usec]
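For example, to pull out everything that process 14690 (the process id in the sample above)
did, a simple grep on the formatted file produced by atrcrpt will do; a sketch, with the
output file name just an example:

grep " 14690 " /tmp/fmt > /tmp/fmt.14690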

Note about an AIX trace on WebSphere:
-------------------------------------

In addition to the WebSphere MQ trace, WebSphere MQ for AIX users can use the
standard AIX system trace. AIX system tracing is a two-step process:

>> Gathering the data
>> Formatting the results

WebSphere MQ uses two trace hook identifiers:

X'30D'
This event is recorded by WebSphere MQ on entry to or exit from a subroutine.
X'30E'
This event is recorded by WebSphere MQ to trace data such as that being sent or
received across a
communications network. Trace provides detailed execution tracing to help you to
analyze problems.
IBM service support personnel might ask for a problem to be re-created with trace
enabled. The files produced
by trace can be very large so it is important to qualify a trace, where possible.
For example, you can optionally
qualify a trace by time and by component.

There are two ways to run trace:

>> Interactively.

The following sequence of commands runs an interactive trace on the program myprog
and ends the trace.

trace -j30D,30E -o trace.file
->!myprog
->q

>> Asynchronously.

The following sequence of commands runs an asynchronous trace on the program
myprog and ends the trace.

trace -a -j30D,30E -o trace.file
myprog
trcstop

You can format the trace file with the command:

trcrpt -t /usr/mqm/lib/amqtrc.fmt trace.file > report.file

report.file is the name of the file where you want to put the formatted trace output.

20.6 Nice example: Tracing with truss on AIX:
---------------------------------------------

Application tracing displays the calls that an application makes to external
libraries and the kernel. These calls give the application access to the network,
the file system, and the display. By watching the calls and their results, you can
get some idea of what the application "expects", which can lead to a solution.

Each UNIX system provides its own commands for tracing. This article introduces you
to truss, which Solaris and AIX support. On Linux, you perform tracing with the strace
command. Although the command-line parameters might be slightly different, application
tracing on other UNIX flavors might go by the names ptrace, ktrace, trace, and tusc.

>> A classic file permissions problem


One class of problems that plagues systems administrators is file permissions. An
application likely has to open
certain files to do its work. If the open operation fails, the application should
let the administrator know.
However, developers often forget to check the result of functions or, to add to
the confusion, perform the check,
but don't adequately handle the error. For example, here's the output of an
application that's failing to open:

$ ./openapp
This should never happen!

After running the fictitious openapp application, I received the unhelpful (and
false) error message,
This should never happen!. This is a perfect time to introduce truss. Listing 1
shows the same application
run under the truss command, which shows all the function calls that this program
made to outside libraries.

Listing 1. Openapp run under truss

$ truss ./openapp
execve("openapp", 0xFFBFFDEC, 0xFFBFFDF4) argc = 1
getcwd("/export/home/sean", 1015) = 0
stat("/export/home/sean/openapp", 0xFFBFFBC8) = 0
open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT
stat("/opt/csw/lib/libc.so.1", 0xFFBFF6F8) Err#2 ENOENT
stat("/lib/libc.so.1", 0xFFBFF6F8) = 0
resolvepath("/lib/libc.so.1", "/lib/libc.so.1", 1023) = 14
open("/lib/libc.so.1", O_RDONLY) = 3
memcntl(0xFF280000, 139692, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
close(3) = 0
getcontext(0xFFBFF8C0)
getrlimit(RLIMIT_STACK, 0xFFBFF8A0) = 0
getpid() = 7895 [7894]
setustack(0xFF3A2088)
open("/etc/configfile", O_RDONLY) Err#13 EACCES [file_dac_read]
ioctl(1, TCGETA, 0xFFBFEF14) = 0
fstat64(1, 0xFFBFEE30) = 0
stat("/platform/SUNW,Sun-Blade-100/lib/libc_psr.so.1", 0xFFBFEAB0) = 0
open("/platform/SUNW,Sun-Blade-100/lib/libc_psr.so.1", O_RDONLY) = 3
close(3) = 0
This should never happen!
write(1, " T h i s s h o u l d ".., 26) = 26
_exit(3)

Each line of the output represents a function call that the application made along
with the return value,
if applicable. (You don't need to know each function call, but for more
information, you can call up the
man page for the function, such as with the command man open.) To find the call
that is potentially
causing the problem, it's often easiest to start at the end (or as close as
possible to where
the problems start). For example, you know that the application outputs This
should never happen!,
which appears near the end of the output. Chances are that if you find this
message and work your way up
through the truss command output, you'll come across the problem.

Scrolling up from the error message, notice the line beginning with
open("/etc/configfile"...,
which not only looks relevant but also seems to return an error of Err#13 EACCES.
Looking at the man page
for the open() function (with man open), it's evident that the purpose of the
function is to open a file
-- in this case, /etc/configfile -- and that a return value of EACCES means that
the problem is related
to permissions. Sure enough, a look at /etc/configfile shows that the user doesn't
have permissions to read
the file. A quick chmod later, and the application is running properly.

The output of Listing 1 shows two other calls, open() and stat(), that return an
error. Many of the calls
toward the beginning of the application, including the other two errors, are added
by the operating system
as it runs the application. Only experience will tell when the errors are benign
and when they aren't.
In this case, the two errors and the three lines that follow them are trying to
find the location of libc.so.1,
which they eventually do. You'll see more about shared library problems later.
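As a cross-check on that kind of shared library searching, ldd (available on Solaris and
Linux) prints how the libraries of a binary resolve without tracing it at all; openapp is
the fictitious program from the listing above:

$ ldd ./openapp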

>> The application doesn't start

Sometimes, an application fails to start properly; but rather than exiting, it
just hangs. This behavior is often
a symptom of contention for a resource (such as two processes competing for a file
lock), or the application
is looking for something that is not coming back. This latter class of problems
could be almost anything,
such as a name lookup that's taking a long time to resolve, or a file that should
be found in a certain spot but
isn't there. In any case, watching the application under truss should reveal the
culprit.

While the first code example showed an obvious link between the system call
causing the problem and the file,
the example you're about to see requires a bit more sleuthing. Listing 2 shows a
misbehaving application
called Getlock run under truss.

Listing 2. Getlock run under truss

$ truss ./getlock
execve("getlock", 0xFFBFFDFC, 0xFFBFFE04) argc = 1
getcwd("/export/home/sean", 1015) = 0
resolvepath("/export/home/sean/getlock", "/export/home/sean/getlock", 1023) = 25
resolvepath("/usr/lib/ld.so.1", "/lib/ld.so.1", 1023) = 12
stat("/export/home/sean/getlock", 0xFFBFFBD8) = 0
open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT
stat("/opt/csw/lib/libc.so.1", 0xFFBFF708) Err#2 ENOENT
stat("/lib/libc.so.1", 0xFFBFF708) = 0
resolvepath("/lib/libc.so.1", "/lib/libc.so.1", 1023) = 14
open("/lib/libc.so.1", O_RDONLY) = 3
close(3) = 0
getcontext(0xFFBFF8D0)
getrlimit(RLIMIT_STACK, 0xFFBFF8B0) = 0
getpid() = 10715 [10714]
setustack(0xFF3A2088)
open("/tmp/lockfile", O_WRONLY|O_CREAT, 0755) = 3
getpid() = 10715 [10714]
fcntl(3, F_SETLKW, 0xFFBFFD60) (sleeping...)

The final call, fcntl(), is marked as sleeping, because the function is blocking.
This means that the function
is waiting for something to happen, and the kernel has put the process to sleep
until the event occurs. To determine
what the event is, you must look at fcntl().

The man page for fcntl() (man fcntl) describes the function simply as "file
control" on Solaris and
"manipulate file descriptor" on Linux. In all cases, fcntl() requires a file
descriptor, which is an integer
describing a file the process has opened, a command that specifies the action to
be taken on the file descriptor,
and finally any arguments required for the specific function. In the example in
Listing 2, the file descriptor is 3,
and the command is F_SETLKW. (The 0xFFBFFD60 is a pointer to a data structure,
which doesn't concern us now.)
Digging further, the man page states that F_SETLKW opens a lock on the file and
waits until the lock can be obtained.

From the first example involving the open() system call, you saw that a successful
call returns a file descriptor.
In the truss output of Listing 2, there are two cases in which the result of
open() returns 3.
Because file descriptors are reused after they are closed, the relevant open() is
the one just above fcntl(),
which is for /tmp/lockfile. A utility like lsof lists any processes holding open a
file. Failing that,
you could trace through /proc to find the process with the open file. However, as
is usually the case,
a file is locked for a good reason, such as limiting the number of instances of
the application or configuring
the application to run in a user-specific directory.
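If lsof is not installed, fuser (shipped with both Solaris and AIX) is a quick way to see
who is holding the lock file; a sketch, where 12345 is a placeholder for a PID that fuser
reports:

# list the PIDs that have the lock file open
fuser /tmp/lockfile
# then look a reported PID up
ps -fp 12345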

>> Attaching to a running process

Sometimes, an application is already running when a problem occurs. Being able to
run an already-running process under truss would be helpful. For example, notice
that in the output of the Top application, a certain process has been consuming
95 percent of the CPU for quite some time, as shown in Listing 3.

Listing 3. Top output showing a CPU-intensive process

PID USERNAME LWP PRI NICE SIZE RES STATE TIME CPU COMMAND
11063 sean 1 0 0 1872K 952K run 87.9H 94.68% udpsend

The -p option to truss allows the owner of the process, or root, to attach to a
running process and view
the system call activity. The process id (PID) is required. In the example shown
in Listing 3, the PID is 11063.
Listing 4 shows the system call activity of the application in question.

Listing 4. truss output after attaching to a running process

$ truss -p 11063

sendto(3, " a b c", 3, 0, 0xFFBFFD58, 16) = 3
sendto(3, " a b c", 3, 0, 0xFFBFFD58, 16) = 3
sendto(3, " a b c", 3, 0, 0xFFBFFD58, 16) = 3
sendto(3, " a b c", 3, 0, 0xFFBFFD58, 16) = 3
sendto(3, " a b c", 3, 0, 0xFFBFFD58, 16) = 3
sendto(3, " a b c", 3, 0, 0xFFBFFD58, 16) = 3
sendto(3, " a b c", 3, 0, 0xFFBFFD58, 16) = 3
sendto(3, " a b c", 3, 0, 0xFFBFFD58, 16) = 3
. repeats ...

The sendto() function's man page (man sendto) shows that this function is used to
send a message from a socket
-- typically, a network connection. The output of truss shows the file descriptor
(the first 3) and the data
being sent (abc). Indeed, capturing a sample of network traffic with the snoop or
tcpdump tool shows a large amount
of traffic being directed to a particular host, which is likely not the result of
a properly behaving application.
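For reference, capturing that sample of traffic could look like the sketch below; the
interface name is a placeholder that depends on the system, and only one of the two
tools is needed:

# Solaris:
snoop udp
# Linux:
tcpdump -n -i eth0 udp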

Note that truss was not able to show the creation of file descriptor 3, because
you had attached after the descriptor
was created. This is one limitation of attaching to a running process and the
reason why you should gather
other information using a tool, such as a packet analyzer before jumping to
conclusions.

This example might seem somewhat contrived (and technically it was, because I
wrote the udpsend application
to demonstrate how to use truss), but it is based on a real situation. I was
investigating a process running
on a UNIX-based appliance that had a CPU-bound process. Tracing the application
showed the same packet activity.
Tracing with a network analyzer showed the packets were being directed to a host
on the Internet. After escalating
with the vendor, I determined that the problem was their application failing to
perform proper error checking
on a binary configuration file. The file had somehow become corrupted. As a
result, the application interpreted
the file incorrectly and repeatedly hammered a random IP address with User
Datagram Protocol (UDP) datagrams.
After I replaced the file, the process behaved as expected.

>> Filtering output

After a while, you'll get the knack of what to look for. While it's possible to
use the grep command to go through
the output, it's easier to configure truss to focus only on certain calls. This
practice is common if you're trying
to determine how an application works, such as which configuration files the
application is using. In this case,
the open() and stat() system calls point to any files the application is trying to
open.

You use open() to open a file, but you use stat() to find information about a
file. Often, an application looks for
a file with a series of stat() calls, and then opens the file it wants.

For truss, you add filtering system calls with the -t option. For strace under
Linux, you use -e. In either case,
you pass a comma-separated list of system calls to be shown on the command line.
By prefixing the list with the
exclamation mark (!), the given calls are filtered out of the output. Listing 5
shows a fictitious application
looking for a configuration file.

Listing 5. truss output filtered to show only stat() and open() functions

$ truss -tstat,open ./app


stat("/export/home/sean/app", 0xFFBFFBD0) = 0
open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT
stat("/opt/csw/lib/libc.so.1", 0xFFBFF700) Err#2 ENOENT
stat("/lib/libc.so.1", 0xFFBFF700) = 0
open("/lib/libc.so.1", O_RDONLY) = 3
stat("/export/home/sean/.config", 0xFFBFFCF0) Err#2 ENOENT
stat("/etc/app/configfile", 0xFFBFFCF0) Err#2 ENOENT
stat("/etc/configfile", 0xFFBFFCF0) = 0
open("/etc/configfile", O_RDONLY) = 3

The final four lines are the key here. The stat() function for
/export/home/sean/.config results in ENOENT,
which means that the file wasn't found. The code then tries /etc/app/configfile
before it finds the correct
information in /etc/configfile. The significance of first checking in the user's
home directory is that you
can override the configuration by user.
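For completeness, the strace equivalent of Listing 5 on Linux would look something like
this sketch:

$ strace -e trace=open,stat ./app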

>> Final thoughts


Whether your operating system uses truss, strace, trace, or something else, the
ability to peer into an application's
behavior is a powerful tool for problem solving. The methodology can be summed up
as follows:

- Describe the problem.
- Trace the application.
- Start at the spot at which the problem occurs and work backward through the system
  calls to identify the problem.
- Use the man pages for help on interpreting the system calls.
- Correct the behavior and test.

Tracing application behavior is a powerful troubleshooting tool, because you're
observing the system calls
that the application makes to the operating system. When the usual problem-solving
methods fail, turn to
application tracing.

20.7 snap command on AIX:
--------------------------

The snap command gathers system configuration information and compresses the
information into a pax file.
The information gathered with the snap command may be required to identify and
resolve system problems.

In normal conditions, the command "snap -gc" should be sufficient. The pax file
will be stored in /tmp/ibmsupt

# snap -gc

create the following file:

/tmp/ibmsupt/snap.pax.Z

Further info:

snap Command

Purpose

Gathers system configuration information.

Syntax

snap [ -a ] [ -A ] [ -b ] [ -B ] [ -c ] [ -C ] [ -D ] [ -f ] [ -g ] [ -G ]
[ -i ] [ -k ] [ -l ] [ -L ][ -n ] [ -N ]
[ -p ] [ -r ] [ -R ] [ -s ] [ -S ] [ -t ] [ -T Filename ] [ -w ] [ -o
OutputDevice ] [ -d Dir ] [ -v Component ]
[ -O FileSplitSize ] [ -P Files ]
[ script1 script2 ... | All | file:filepath ]

snap [ -a ] [ -A ] [ -b ] [ -B ] [ -c ] [ -C ] [ -D ] [ -f ] [ -g ] [ -G ]
[ -i ] [ -k ] [ -l ] [ -L ][ -n ] [ -N ]
[ -p ] [ -r ] [ -R ] [ -s ] [ -S ] [ -t ] [ -T Filename ] [ -o
OutputDevice ] [ -d Dir ] [ -v Component ]
[ -O FileSplitSize ] [ -P Files ] [
script1 script2 ... | All | file:filepath ]

snap -e [ -m Nodelist ] [ -d Dir ]

Description

The snap command gathers system configuration information and compresses
the information into a pax file. The file may then be
written to a device such as tape or DVD, or transmitted to a remote system.
The information gathered with the snap command might be
required to identify and resolve system problems. Note: Root user authority
is required to execute the snap command. Use the snap -o
/dev/cd0 command to copy the compressed image to DVD. Use the snap -o
/dev/rmt0 command to copy the image to tape.

Use the snap -o /dev/rfd0 command to copy the compressed image to diskette.

Approximately 8MB of temporary disk space is required to collect all system
information, including contents of the error log. If you
do not gather all system information with the snap -a command, less disk
space may be required (depending on the options selected).
Note: If you intend to use a tape to send a snap image to IBM(R) for
software support, the tape must be one of the following formats:
* 8mm, 2.3 Gb capacity
* 8mm, 5.0 Gb capacity
* 4mm, 4.0 Gb capacity

Using other formats prevents or delays IBM software support from being able
to examine the contents.

The snap -g command gathers general system information, including the following:
* Error report
* Copy of the customized Object Data Manager (ODM) database
* Trace file
* User environment
* Amount of physical memory and paging space
* Device and attribute information
* Security user information

The output of the snap -g command is written to the
/tmp/ibmsupt/general/general.snap file.

The snap command checks for available space in the /tmp/ibmsupt directory,
the default directory for snap command output. You can
write the output to another directory by using the -d flag. If there is not
enough space to hold the snap command output, you must
expand the file system.

Each execution of the snap command appends information to previously
created files. Use the -r flag to remove previously gathered and
saved information.
Flags:

-a
Gathers all system configuration information. This option requires
approximately 8MB of temporary disk space.
-A
Gathers asynchronous (TTY) information.
-b
Gathers SSA information.
-B
Bypasses collection of SSA adapter dumps. The -B flag only works when
the -b flag is also specified; otherwise, the -B flag is
ignored.
-c
Creates a compressed pax image (snap.pax.Z file) of all files in the
/tmp/ibmsupt directory tree or other named output
directory. Note: Information not gathered with this option should be
copied to the snap directory tree before using the -c flag.
If a test case is needed to demonstrate the system problem, copy the
test case to the /tmp/ibmsupt/testcase directory before
compressing the pax file.
-C
Retrieves all the files in the fwdump_dir directory. The files are
placed in the "general" subdirectory. The -C snap option
behaves the same as -P*.
-D
Gathers dump and /unix information. The primary dump device is used.
Notes:
1 If bosboot -k was used to specify the running kernel to be
other than /unix, the incorrect kernel is gathered. Make sure
that /unix is , or is linked to, the kernel in use when the
dump was taken.
2 If the dump file is copied to the host machine, the snap
command does not collect the dump image in the /tmp/ibmsupt/dump
directory. Instead, it creates a link in the dump directory to
the actual dump image.
-d AbsolutePath
Identifies the optional snap command output directory (/tmp/ibmsupt is
the default). You must specify the absolute path.
-e
Gathers HACMP(TM) specific information. Note: HACMP specific data is
collected from all nodes belonging to the cluster . This
flag cannot be used with any other flags except -m and -d.
-f
Gathers file system information.
-g
Gathers the output of the lslpp -hac command, which is required to
recreate exact operating system environments. Writes output
to the /tmp/ibmsupt/general/lslpp.hBc file. Also collects general
system information and writes the output to the
/tmp/ibmsupt/general/general.snap file.
-G
Includes predefined Object Data Manager (ODM) files in general
information collected with the -g flag.
-i
Gathers installation debug vital product data (VPD) information.


21. Logfiles:
=============

Solaris:
--------
Unix message files record all system problems like disk errors, swap errors, NFS
problems, etc.
Monitor the following files on your system to detect system problems:

tail -f /var/adm/SYSLOG
tail -f /var/adm/messages
tail -f /var/log/syslog

You can also use the dmesg command.
Messages are recorded by the syslogd daemon.

Diagnostics can be done from the OK prompt after a reboot, like probe-scsi, show-devs,
show-disks, test memory etc.
You can also use the SunVTS tool to run diagnostics. SunVTS is Sun's Validation Test Suite.

System dumps:
You can manage system dumps by using the dumpadm command.

AIX:
----

Periodically, the following files have to be decreased in size. You can use the cat /dev/null
command on each of them, or the small loop sketched after the list below.

Example: cat /dev/null >/var/adm/sulog

/var/adm/sulog
/var/adm/cron/log
/var/adm/wtmp
/etc/security/failedlogin
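A minimal ksh sketch that applies the same cat /dev/null trick to all four files in one go:

#!/usr/bin/ksh
# truncate the housekeeping logs listed above
for LOG in /var/adm/sulog /var/adm/cron/log /var/adm/wtmp /etc/security/failedlogin
do
  cat /dev/null > $LOG
done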

Notes about the error log, that's the file /var/adm/ras/errlog.

Do NOT use cat /dev/null to clear the errorlog.
Use instead the following procedure:

# /usr/lib/errstop       (stop the error daemon)
move the errlog file
# /usr/lib/errdemon      (start the error daemon again)

errdemon:
---------

On most UNIX systems, information and errors from system events and processes are
managed by the
syslog daemon (syslogd); depending on settings in the configuration file
/etc/syslog.conf, messages are passed
from the operating system, daemons, and applications to the console, to log files,
or to nowhere at all.
AIX includes the syslog daemon, and it is used in the same way that other UNIX-
based operating systems use it.
In addition to syslog, though, AIX also contains another facility for the
management of hardware, operating system,
and application messages and errors. This facility, while simple in its operation,
provides unique and valuable
insight into the health and happiness of an AIX system.
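As a quick aside on the syslog side mentioned above: on AIX, syslogd logs nothing until
/etc/syslog.conf is populated. A minimal sketch; the selector and file name are just
examples, and on AIX the target file should exist before syslogd writes to it:

# /etc/syslog.conf entry (selector and action are tab separated):
*.info;mail.none        /var/log/syslog.out

# create the target file and make syslogd re-read its configuration:
touch /var/log/syslog.out
refresh -s syslogd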

The AIX error logging facility components are part of the bos.rte and the
bos.sysmgt.serv_aid packages,
both of which are automatically placed on the system as part of the base operating
system installation.

Unlike the syslog daemon, which performs no logging at all in its default
configuration as shipped,
the error logging facility requires no configuration before it can provide useful
information about the system.
The errdemon is started during system initialization and continuously monitors the
special file /dev/error
for new entries sent by either the kernel or by applications. The label of each
new entry is checked
against the contents of the Error Record Template Repository, and if a match is
found, additional information
about the system environment or hardware status is added, before the entry is
posted to the error log.

The actual file in which error entries are stored is configurable; the default is
/var/adm/ras/errlog.
That file is in a binary format and so should never be truncated or zeroed out
manually. The errlog file
is a circular log, storing as many entries as can fit within its defined size. A
memory buffer is set
by the errdemon process, and newly arrived entries are put into the buffer before
they are written to the log
to minimize the possibility of a lost entry. The name and size of the error log
file and the size of the memory buffer
may be viewed with the errdemon command:

[aixhost:root:/] # /usr/lib/errdemon -l
Error Log Attributes
--------------------------------------------
Log File /var/adm/ras/errlog
Log Size 1048576 bytes
Memory Buffer Size 8192 bytes

The parameters displayed may be changed by running the errdemon command with other
flags, documented
in the errdemon man page. The default sizes and values have always been sufficient
on our systems,
so I've never had reason to change them.

Due to use of a circular log file, it is not necessary (or even possible) to
rotate the error log.
Without intervention, errors will remain in the log indefinitely, or until the log
fills up with new entries.
As shipped, however, the crontab for the root user contains two entries that are
executed daily,
removing hardware errors that are older than 90 days, and all other errors that
are older than 30 days.

0 11 * * * /usr/bin/errclear -d S,O 30
0 12 * * * /usr/bin/errclear -d H 90
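If you do not want to wait for these cron jobs, errclear can be run by hand with the same
class letters; two sketches based on the entries above:

# remove all software (S) and operator (O) entries older than 7 days:
errclear -d S,O 7
# remove every hardware (H) entry right away:
errclear -d H 0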

The errdemon daemon constantly checks the /dev/error special file, and when new data
is written, the daemon conducts a series of operations.
is written, the deamon conducts a series of operations.

- To determine the path to your system's error logfile, run the command:
# /usr/lib/errdemon -l
Error Log Attributes
Log File /var/adm/ras/errlog
Log Size 1048576 bytes
Memory 8192 bytes

- To change the maximum size of the error log file, enter:
# /usr/lib/errdemon -s 200000

You can generate the error reports using smitty or through the errpt command.

# smitty errpt gives you a dialog screen where you can select types of
information.

# errpt -a
# errpt - d H

# errpt -a|pg Produces a detailed report for each entry in the error log
# errpt -aN hdisk1 Displays an error log for ALL errors that occurred on this drive. If more than a few errors
                   occur within a 24 hour period, execute the CERTIFY process under DIAGNOSTICS to determine
                   if a PV is becoming marginal.
if a PV is becoming marginal.

If you use the errpt without any options, it generates a summary report.
If used with the -a option, a detailed report is created.
You can also display errors of a particular class, for example for the Hardware
class.

Examples using errpt:
---------------------

To display a complete summary report, enter:

errpt
To display a complete detailed report, enter:
errpt -a

To display a detailed report of all errors logged for the error identifier
E19E094F, enter:
errpt -a -j E19E094F

To display a detailed report of all errors logged in the past 24 hours, enter:
errpt -a -s mmddhhmmyy

where the mmddhhmmyy string equals the current month, day, hour, minute, and year,
minus 24 hours.
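For example, to look at everything logged since 10:00 on 29 January 2007 (one day before the
timestamps in the sample report further down), the string would be 0129100007:

errpt -a -s 0129100007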
To list error-record templates for which logging is turned off for any error-log
entries, enter:
errpt -t -F log=0

To view all entries from the alternate error-log file
/var/adm/ras/errlog.alternate, enter:
errpt -i /var/adm/ras/errlog.alternate

To view all hardware entries from the alternate error-log file
/var/adm/ras/errlog.alternate, enter:
errpt -i /var/adm/ras/errlog.alternate -d H

To display a detailed report of all errors logged for the error label ERRLOG_ON,
enter:
errpt -a -J ERRLOG_ON

To display a detailed report of all errors and group duplicate errors, enter:

errpt -aD
To display a detailed report of all errors logged for the error labels DISK_ERR1
and DISK_ERR2 during
the month of August, enter:
errpt -a -J DISK_ERR1,DISK_ERR2 -s 0801000004 -e 0831235904

errclear:

Deletes entries in the error log

Example: errclear 0 (removes all entries from the error log)

Example errorreport:
--------------------

Example 1:
----------

P550:/home/reserve $ errpt

IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
0EC00096 0130224507 P U SYSPFS STORAGE SUBSYSTEM FAILURE
0EC00096 0130224007 P U SYSPFS STORAGE SUBSYSTEM FAILURE
0EC00096 0130224007 P U SYSPFS STORAGE SUBSYSTEM FAILURE
0EC00096 0130223507 P U SYSPFS STORAGE SUBSYSTEM FAILURE
F7DDA124 0130223507 U H LVDD PHYSICAL VOLUME DECLARED MISSING
52715FA5 0130223507 U H LVDD FAILED TO WRITE VOLUME GROUP STATUS AREA
CAD234BE 0130223507 U H LVDD QUORUM LOST, VOLUME GROUP CLOSING
613E5F38 0130223507 P H LVDD I/O ERROR DETECTED BY LVM
613E5F38 0130223507 P H LVDD I/O ERROR DETECTED BY LVM
613E5F38 0130223507 P H LVDD I/O ERROR DETECTED BY LVM
0873CF9F 0130191907 T S pts/4 TTYHOG OVER-RUN
0EC00096 0130162407 P U SYSPFS STORAGE SUBSYSTEM FAILURE
51E537B5 0130161807 P H sysplanar0 platform_dump saved to file
291D64C3 0130161807 I H sysplanar0 platform_dump indicator event
291D64C3 0130161807 I H sysplanar0 platform_dump indicator event
BFE4C025 0130161807 P H sysplanar0 UNDETERMINED ERROR
51E537B5 0130161707 P H sysplanar0 platform_dump saved to file
291D64C3 0130161707 I H sysplanar0 platform_dump indicator event
291D64C3 0130161707 I H sysplanar0 platform_dump indicator event
51E537B5 0130161707 P H sysplanar0 platform_dump saved to file
291D64C3 0130161707 I H sysplanar0 platform_dump indicator event
291D64C3 0130161707 I H sysplanar0 platform_dump indicator event
BFE4C025 0130161607 P H sysplanar0 UNDETERMINED ERROR
BFE4C025 0130161407 P H sysplanar0 UNDETERMINED ERROR
BFE4C025 0130161307 P H sysplanar0 UNDETERMINED ERROR
BFE4C025 0130161307 P H sysplanar0 UNDETERMINED ERROR
BFE4C025 0130161207 P H sysplanar0 UNDETERMINED ERROR
BFE4C025 0130161207 P H sysplanar0 UNDETERMINED ERROR
0EC00096 0130161207 P U SYSPFS STORAGE SUBSYSTEM FAILURE
BFE4C025 0130161107 P H sysplanar0 UNDETERMINED ERROR
D2A1B43E 0130161107 P U SYSPFS FILE SYSTEM CORRUPTION
D2A1B43E 0130161107 P U SYSPFS FILE SYSTEM CORRUPTION
CD546B25 0130161107 I O SYSPFS FILE SYSTEM RECOVERY REQUIRED
CD546B25 0130161107 I O SYSPFS FILE SYSTEM RECOVERY REQUIRED
1ED0A744 0130161107 P U SYSPFS FILE SYSTEM LOGGING SUSPENDED
CD546B25 0130161107 I O SYSPFS FILE SYSTEM RECOVERY REQUIRED
D2A1B43E 0130161107 P U SYSPFS FILE SYSTEM CORRUPTION
1ED0A744 0130161107 P U SYSPFS FILE SYSTEM LOGGING SUSPENDED
F7DDA124 0130161107 U H LVDD PHYSICAL VOLUME DECLARED MISSING
52715FA5 0130161107 U H LVDD FAILED TO WRITE VOLUME GROUP STATUS AREA
CAD234BE 0130161107 U H LVDD QUORUM LOST, VOLUME GROUP CLOSING
613E5F38 0130161107 P H LVDD I/O ERROR DETECTED BY LVM
EAA3D429 0130161107 U S LVDD PHYSICAL PARTITION MARKED STALE
613E5F38 0130161107 P H LVDD I/O ERROR DETECTED BY LVM
613E5F38 0130161107 P H LVDD I/O ERROR DETECTED BY LVM
41BF2110 0130161107 U H LVDD MIRROR WRITE CACHE WRITE FAILED
613E5F38 0130161107 P H LVDD I/O ERROR DETECTED BY LVM
CAD234BE 0130161107 U H LVDD QUORUM LOST, VOLUME GROUP CLOSING
F7DDA124 0130161107 U H LVDD PHYSICAL VOLUME DECLARED MISSING
41BF2110 0130161107 U H LVDD MIRROR WRITE CACHE WRITE FAILED
613E5F38 0130161107 P H LVDD I/O ERROR DETECTED BY LVM
6472E03B 0130161107 P H sysplanar0 EEH permanent error for adapter
FEC31570 0130161107 P H sisscsia2 UNDETERMINED ERROR
C14C511C 0130161107 T H scsi5 ADAPTER ERROR
BFE4C025 0130161107 P H sysplanar0 UNDETERMINED ERROR
FE2DEE00 0130144307 P S SYSXAIXIF DUPLICATE IP ADDRESS DETECTED IN THE NET
FE2DEE00 0130143207 P S SYSXAIXIF DUPLICATE IP ADDRESS DETECTED IN THE NET
B6048838 0129100507 P S SYSPROC SOFTWARE PROGRAM ABNORMALLY TERMINATED
B6048838 0129100307 P S SYSPROC SOFTWARE PROGRAM ABNORMALLY TERMINATED

You might create a script called alert.sh and call it from your .profile

#!/usr/bin/ksh
cd ~
rm -rf /root/alert.log
echo "Important alerts in errorlog: " >> /root/alert.log
errpt | grep -i STORAGE >> /root/alert.log
errpt | grep -i QUORUM >> /root/alert.log
errpt | grep -i ADAPTER >> /root/alert.log
errpt | grep -i VOLUME >> /root/alert.log
errpt | grep -i PHYSICAL >> /root/alert.log
errpt | grep -i STALE >> /root/alert.log
errpt | grep -i DISK >> /root/alert.log
errpt | grep -i LVM >> /root/alert.log
errpt | grep -i LVD >> /root/alert.log
errpt | grep -i UNABLE >> /root/alert.log
errpt | grep -i USER >> /root/alert.log
errpt | grep -i CORRUPT >> /root/alert.log
cat /root/alert.log

if [ `cat /root/alert.log | wc -l` -eq 1 ]
then
  echo "No critical errors found."
fi

echo " "


echo "Filesystems that might need attention, e.g. %used:"
df -k |awk '{print $4,$7}' |grep -v "Filesystem"|grep -v tmp > /tmp/tmp.txt
cat /tmp/tmp.txt | sort -n | tail -3

Example 2:
----------

IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
173C787F 0710072007 I S topsvcs Possible malfunction on local adapter
90D3329C 0710072007 P S topsvcs NIM read/write error
AE3E3FAD 0710064907 I O SYSJ2 FSCK FOUND ERRORS
AE3E3FAD 0710064907 I O SYSJ2 FSCK FOUND ERRORS
AE3E3FAD 0710064907 I O SYSJ2 FSCK FOUND ERRORS
AE3E3FAD 0710064907 I O SYSJ2 FSCK FOUND ERRORS
AE3E3FAD 0710064907 I O SYSJ2 FSCK FOUND ERRORS
AE3E3FAD 0710064907 I O SYSJ2 FSCK FOUND ERRORS
AE3E3FAD 0710064907 I O SYSJ2 FSCK FOUND ERRORS
C1348779 0710061107 I O SYSJ2 LOG I/O ERROR
C1348779 0710061107 I O SYSJ2 LOG I/O ERROR
C1348779 0710061107 I O SYSJ2 LOG I/O ERROR
EAA3D429 0710061007 U S LVDD PHYSICAL PARTITION MARKED STALE

IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
12337A8D 0723152107 T S DR_KER_MEM Affected memory not available for DR rem

Some notes on disk related errors:
----------------------------------

DISK_ERR4 is bad block relocation. Not a serious error.
DISK_ERR2 is a hardware error as opposed to a media or corrected read error on
disk. This is serious.

EAA3D429 0121151108 U S LVDD PHYSICAL PARTITION MARKED STALE

Note 1:
-------

thread 1:

Q:

Has anyone seen these errors before? We're running 6239 fc cards on a
CX600. AIX level is 52-03 with the latest patches for devices.pci.df1000f7
as well.

I didn't know that these adapters still used devices.pci.df1000f7 as part
of their device driver set, but apparently they do. We're mostly seeing
ERR4s on bootup and occasionally throughout the day. They're TEMP but
should I be concerned about this? Any help would be greatly appreciated!

LABEL: SC_DISK_ERR4
IDENTIFIER: DCB47997

A:

DISK_ERR_4 are simply bad-block relocation errors. They are quite normal.
However, I heard that if you get more than 8 in an 8-hour period, you
should get the disk replaced as it is showing signs of impending failure.

thread 2:

Q:

> Has anyone corrected this issue? SC_DISK_ERR2 with EMC Powerpath
> filesets listed below? I am using a CX-500.
>

A:

got those errors before using a CX700 and it turned out to be a
firmware problem on the fibre adapter, model 6259. EMC recommended the
92X1 firmware and to find out IBM found problems with timeouts to the
drives and recommended going back a level to 81X1.

A:

We have the same problem as well. EMC say its a firmware error on the
FC adapters

A:

This is how to fix these errors; downgrading firmware is not recommended.

Correcting SCSI_DISK_ERR2's in the AIX Errpt Log - Navisphere Failover Wizard

1. In the Navisphere main screen, select tools and then click the
Failover Setup Wizard. Click next to continue.

2. From the drop-down list select the host server you wish to
modify and click next

3. Highlight the CX-500 and click next

4. Under the specify settings box be sure to select 1 for the
failover setting and disable for array commpath. Click next to process.
5. The next screen is the opportunity to review your selections
(host, failover mode and array commpath); click next to commit
6. The following screen displays a warning message to alert you are
committing these changes. Click yes to process.

7. Next login to the AIX command prompt as root and perform the
following commands to complete stopping the SCSI_DISK_ERR2.
a. lsdev -Cc disk | grep LUNZ
   (Filter for disks with LUNZ in the description)
b. rmdev -dl hdisk(#)'s
   (Note the disks and remove them from the ODM)
c. errclear 0
   (Clear the AIX system error log)
d. cfgmgr -v
   (Attempt to re-add the LUNZ disks)
e. lsdev -Cc disk | grep LUNZ
   (Double check to make sure the LUNZ disk does not add itself back to the
   system after the cfgmgr command)
f. errpt -a
   (Monitor the AIX error log to insure the SCSI_DISK_ERR2's are gone)

Task Complete...

E87EF1BE 0512150008 P O dumpcheck The largest dump device is too small.


------------------------------------------------------------------------------

Problems with errpt:
--------------------

Invalid log, or other problems

thread 1:

Q:

Hello ...

the 'errpt' Command tells me:

0315-180 logread: UNEXPECTED EOF
0315-171 Unable to process the error log file /var/adm/ras/errlog.
0315-132 The supplied error log is not valid: /var/adm/ras/errlog.

# ls -l /var/adm/ras/errlog
-rw-r--r-- 1 root system 0 Jun 14 17:31 /var/adm/ras/errlog

How can I fix this problem?

A:

/usr/lib/errstop # stop logging

rm /var/adm/ras/errlog # get rid of that log.

/usr/lib/errdemon # restart the daemon, creating a new error log.

diag command:
-------------

Whenever a hardware problem occurs in AIX, use the diag command to diagnose the
problem.

The diag command is the starting point to run a wide choice of tasks and service
aids.
Most of the tasks/service aids are platform specific.

To run diagnostics on the scdisk0 device, without questions, enter:

# diag -d scdisk0 -c

System dumps:
-------------

A system dump is created when the system has an unexpected system halt or system
failure.
In AIX 5L the default dump device is /dev/hd6, which is also the default paging
device.
You can use the sysdumpdev command to manage system crash dumps.

The sysdumpdev command changes the primary or secondary dump device designation in
a system that is running.
The primary and secondary dump devices are designated in a system configuration
object.
The new device designations are in effect until the sysdumpdev command is run
again, or the system is restarted.

If no flags are used with the sysdumpdev command, the dump devices defined in the
SWservAt
ODM object class are used. The default primary dump device is /dev/hd6. The
default secondary dump device is
/dev/sysdumpnull.

Examples
To display current dump device settings, enter:
sysdumpdev -l

To designate logical volume hd7 as the primary dump device, enter:


sysdumpdev -p /dev/hd7

To designate tape device rmt0 as the secondary dump device, enter:


sysdumpdev -s /dev/rmt0

To display information from the previous dump invocation, enter:


sysdumpdev -L

To permanently change the database object for the primary dump device to
/dev/newdisk1, enter:
sysdumpdev -P -p /dev/newdisk1

To determine if a new system dump exists, enter:


sysdumpdev -z

If a system dump has occurred recently, output similar to the following will
appear:

4537344 /dev/hd7
To designate remote dump file /var/adm/ras/systemdump on host mercury for a
primary dump device, enter:
sysdumpdev -p mercury:/var/adm/ras/systemdump

A : (colon) must be inserted between the host name and the file name.
To specify the directory that a dump is copied to after a system crash, if the
dump device is /dev/hd6, enter:
sysdumpdev -d /tmp/dump

This attempts to copy the dump from /dev/hd6 to /tmp/dump after a system crash. If
there is an error during the copy,
the system continues to boot and the dump is lost.
To specify the directory that a dump is copied to after a system crash, if the
dump device is /dev/hd6, enter:
sysdumpdev -D /tmp/dump

This attempts to copy the dump from /dev/hd6 to the /tmp/dump directory after a
crash. If the copy fails,
you are prompted with a menu that allows you to copy the dump manually to some
external media.

Starting a system dump:
-----------------------

If you have the Software Service Aids Package installed, you have access to the
sysdumpstart command.
You can start the system dump by entering:
# sysdumpstart -p

You can also use:


# smit dump

Notes regarding system dumps:
-----------------------------
note 1:
-------

The_Nail <[email protected]> wrote:


> I handle several AIX 5.1 servers and some of them warn me (via errpt)
> about a lack of disk space for the dumpcheck resource.
> Here is a copy of the message :

>
> Description
> The copy directory is too small.
>
> Recommended Actions
> Increase the size of that file system.
>
> Detail Data
> File system name
> /var/adm/ras
>
> Current free space in kb
> 7636
> Current estimated dump size in kb
> 207872

> I guess /dev/hd6 is not big enough to contain a system dump. So how
> can i change that?

The error message tells you something else.


Read it, and you will understand!

> How can i configure a secondary sysdump space in case the primary
> would be unavailable?

sysdumpdev -s /dev/whatever

> What does "copy directory /var/adm/ras" mean?

That's where the crash dump will be put when you reboot after the crash.
/dev/hd6 will be needed for other purposes (paging space), so you cannot
keep your system dump there.

And that file system is too small to contain the dump, that's the meaning
of the error message.

You have two options:

- increase the /var file system (it should have ample free space anyway).
- change the dump directory to something where you have more space:
sysdumpdev -D /something/in/rootvg/with/free/space

Yours,
Laurenz Albe

Note 2:
-------

Suppose you find the following error:

$ errpt
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
F89FB899 0822150005 P O dumpcheck The copy directory is too small

This message is the result of a dump device check. You can fix this by
increasing the size of your dump device. If you are using the default
dump device (/dev/hd6) then increase your paging size or go to smit dump
and "select System Dump Compression". Myself, I don't like to use the
default dump device so I create a sysdumplv and make sure I have enough
space. To check space needed go to smit dump and select "Show Estimated
Dump Size" this will give you an idea about the size needed.

The copy directory is whatever sysdumpdev says it is.


Run sysdumpdev and you will get something like
#sysdumpdev
primary /dev/hd6
secondary /dev/sysdumpnull
copy directory /var/adm/ras
forced copy flag TRUE
always allow dump FALSE
dump compression ON

# sysdumpdev -e
0453-041 Estimated dump size in bytes: 57881395

Divide this number by 1024 to get the free space that is needed in your copy
directory in KB, and compare it to the output of "df -k". Alternatively, divide
it by 512 to get the size in 512-byte blocks and compare it to a plain "df".
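
The comparison can also be scripted. The following ksh sketch is an illustration only;
the awk field positions assume the usual AIX output of sysdumpdev and df -k and may
need adjusting on your system.

#!/bin/ksh
# Sketch: warn if the copy directory is smaller than the estimated dump size.
COPYDIR=$(sysdumpdev -l | grep "copy directory" | awk '{print $NF}')
EST_BYTES=$(sysdumpdev -e | awk '{print $NF}')
NEEDED_KB=$(( EST_BYTES / 1024 ))
FREE_KB=$(df -k $COPYDIR | tail -1 | awk '{print $3}')

echo "Copy directory : $COPYDIR"
echo "Estimated dump : $NEEDED_KB KB"
echo "Free space     : $FREE_KB KB"

if [ $FREE_KB -lt $NEEDED_KB ]; then
   echo "WARNING: the copy directory is too small for the estimated dump."
fi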

HP:
---

22. Diagnostic output:
======================

0: Standard input   1: Standard output   2: Diagnostic output

Redirect diagnostic output to a file:

# cat somefile nofile 2>errfile
# cat somefile nofile > outfile 2>errfile

Redirect diagnostic output to the same place as standard output:

# cat firsthalf secondhalf > composite 2>&1
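
A few more common redirection variants (generic POSIX shell / ksh examples,
not tied to any particular platform):

# somecommand > out.log 2>&1          # stdout and stderr to the same file
# somecommand 2> /dev/null            # discard diagnostic output
# somecommand 2>&1 | tee out.log      # both streams to the screen and to a file
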
23. DOS2UNIX:
=============

If you want to convert a ascii PC file to unix, you can use many tools like tr
etc..

# tr -d '\015' < original.file > new.file

Or scripts like:

#!/bin/sh
perl -p -i -e 'BEGIN { print "Converting DOS to UNIX.\n" ; } END { print "Done.\n"
; } s/\r\n$/\n/' $*

perl -p -i.bak -e 's/^\r+//;s/\r+$//;s/\r/\n/gs' file

Or, on many unixes You can use the utility " dos2unix " to remove the ^M
Just type: dos2unix <filename1> <filename2> [RETURN]

dos2unix [ -ascii ] [ -iso ] [ -7 ] originalfile convertedfile

-ascii
Removes extra carriage returns and converts end of file characters in DOS format
text files to conform to SunOS requirements.
-iso
This is the default. It converts characters in the DOS extended character set to
the corresponding ISO standard characters.
-7
Convert 8 bit DOS graphics characters to 7 bit space characters so that SunOS can
read the file.

#!/bin/sh
# a script to strip carriage returns from DOS text files
if test -f "$1"
then
  tr -d '\r' < "$1" > "$1.tmp"
  mv "$1.tmp" "$1"
fi

# tr -d '\015' < original.file > new.file
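
To convert a whole set of files in place, a small loop can be used. This is just a
sketch; the *.txt pattern is an arbitrary example.

#!/bin/sh
# Sketch: strip carriage returns from every .txt file in the current directory.
for F in *.txt
do
   tr -d '\r' < "$F" > "$F.tmp" && mv "$F.tmp" "$F"
done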

Note: Other formats on AIX:


---------------------------

1. nvdmetoa command:

How to convert EBCDIC files to ASCII:

On your AIX system, the tool nvdmetoa might be present.

Examples:

nvdmetoa <AS400.dat >AIXver3.dat


Converts an EBCDIC file taken off an AS400 and converts to an ASCII file for the
pSeries or RS/6000

nvdmetoa 132 <AS400.txt >AIXver3.txt

Converts an EBCDIC file with a record length of 132 characters to an ASCII file
with 132 bytes per line
PLUS 1 byte for the linefeed character.

2. od command:

The od command translate a file into other formats, like for example hexadecimal
format.
To translate a file into several formats at once, enter:

# od -t cx a.out > a.xcd

This command writes the contents of the a.out file, in hexadecimal format (x) and
character format (c),
into the a.xcd file.
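
The od command is also handy for checking whether a file still contains DOS line
endings: in a character dump, carriage returns show up as \r before the \n. For example:

# od -c original.file | head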

24. Secure shell connections:
=============================

ssh:
====

What is Open Secure Shell?

Open Secure Shell (OpenSSH) is an open source version of the SSH protocol suite of
network connectivity tools.
The tools provide shell functions that are authenticated and encrypted. A shell is
a command language interpreter
that reads input from a command line string, stdin or a file. Why use OpenSSH?
When you're running over
insecure public networks like the Internet, you can use the SSH command suite
instead of the insecure commands telnet,
ftp, and r-commands.

OpenSSH delivers code that communicates using SSH1 and SSH2 protocols. What's the
difference? The SSH2 protocol
is a re-write of SSH1. SSH2 contains separate, layered protocols, but SSH1 is one
large set of code. SSH2 supports
both RSA & DSA keys, but SSH1 supports only RSA, and SSH2 uses a strong crypto
integrity check, where SSH1 uses
a CRC-32 check. The Internet Engineering Task Force (IETF) maintains the secure
shell standards.

Example 1:
----------
Go to a terminal on your local Unix system (Solaris, Linux, Mac OS X, etc.) and
type the following command:

ssh -l username acme.gatech.edu

Replace "username" with your Prism ID. If this is your first time connecting to
acme, you will see
a warning similar to this:

The authenticity of host 'acme.gatech.edu (130.207.165.23)' can't be established.
DSA key fingerprint is 72:ce:63:c5:86:3a:cb:8c:cb:43:6c:da:00:0d:4c:1f.
Are you sure you want to continue connecting (yes/no)?

Type the word "yes" and hit <ENTER>. You should see the following warning:

Warning: Permanently added 'acme.gatech.edu,130.207.165.23' (DSA) to the list of
known hosts.

Next, you will be prompted for your password. Type your password and hit <ENTER>.

Example 2:
----------

A secure shell 'terminal':

# ssh -l oracle 193.172.126.193


# ssh [email protected]

pscp:
=====

Example to Copy a file to a remote unix server:

# pscp c:\documents\foo.txt [email protected]:/tmp/foo

To receive (a) file(s) from a remote server:

pscp [options] [user@]host:source target


So to copy the file /etc/hosts from the server example.com as user fred to the
file c:\temp\example-hosts.txt,
you would type:

pscp [email protected]:/etc/hosts c:\temp\example-hosts.txt

To send (a) file(s) to a remote server:

pscp [options] source [source...] [user@]host:target


So to copy the local file c:\documents\foo.txt to the server example.com as user
fred to the file /tmp/foo you would type:

pscp c:\documents\foo.txt [email protected]:/tmp/foo

You can use wildcards to transfer multiple files in either direction, like this:

pscp c:\documents\*.doc [email protected]:docfiles


pscp [email protected]:source/*.c c:\source

Example of scripts using pscp with parameters;

------------------------------------
@echo off

REM Script to copy a file from a UNIX system to the workstation via pscp.exe.

Echo Copy file from unix to workstation

SET /P systemname=Enter the full system name:
SET /P remotefile=Enter the UNIX path+filename:
SET /P localfile=Enter the local filename:
SET /P username=Enter the username:

echo pscp.exe %username%@%systemname%:%remotefile% %localfile%

pscp.exe %username%@%systemname%:%remotefile% %localfile%

echo file %remotefile% copied to %localfile%

pause

------------------------------------

@echo off

REM Script to copy a file from the workstation to a UNIX system via pscp.exe.

Echo Copy file from workstation to unix

SET /P systemname=Enter the full system name:
SET /P localfile=Enter the local filename:
SET /P remotefile=Enter the UNIX path+filename:
SET /P username=Enter the username:

echo pscp.exe %localfile% %username%@%systemname%:%remotefile%

pscp.exe %localfile% %username%@%systemname%:%remotefile%
echo file %localfile% copied to %remotefile%
pause
------------------------------------

scp:
====

Scp is a utility which allows files to be copied between machines. Scp is an
updated version of an older utility named rcp. It works the same, except that
information (including the password used to log in) is encrypted. Also, if you
have set up your .shosts file to allow you to ssh between machines without using
a password (as described in the help on setting up your .shosts file), you will
be able to scp files between machines without entering your password.
Either the source or the destination may be on the remote machine; i.e., you may
copy files or directories
into the account on the remote system OR copy them from the account on the remote
system into the account
you are logged into.

Example:
# scp conv1.tar.gz [email protected]:/backups/520backups/splenvs
# scp conv2.tar.gz [email protected]:/backups/520backups/splenvs

Example:
# scp myfile xyz@sdcc7:myfile

Example:
To copy a directory, use the -r (recursive) option.
# scp -r mydir xyz@sdcc7:mydir

Example:
cd /oradata/arc
/usr/local/bin/scp *.arc SPRAT:/oradata/arc

Example:
While logged into xyz on sdcc7, copy file "letter" into file "application" in
remote account abc on sdcc3:
% scp letter abc@sdcc3:application

While logged into abc on sdcc3, copy file "foo" from remote account xyz on sdcc7
into filename "bar" in abc:
% scp xyz@sdcc7:foo bar

To permit a connection (ssh or scp) from a local machine to a remote machine
without always typing a password, on the remote machine create the file ".shosts"
in your home directory, containing the name of the local machine. The permissions
on ".shosts" should be rw for the user and --- for everyone else (the command
chmod 600 .shosts will set the permissions correctly). If you have the file
".rhosts", please delete it.
SSH and SCP will use the ssh_known_hosts file. If the local machine is correctly
entered in the user's .ssh/known_hosts file, then the connection will be permitted
without a password.

To make this work, you may need to log back in from the remote machine to your
local machine.
For example, if your local machine is i7.msi.umn.edu and you want to connect to
origin.msi.umn.edu,
use the following procedure to set up connecting from i7 to origin without a
password:

Establish an ssh connection to origin:

ssh -X origin.msi.umn.edu

After typing a password and establishing a connection, add i7.msi.umn.edu to the
file ".shosts" in your home directory.
Establish an ssh connection back to i7.msi.umn.edu:

ssh -X i7.msi.umn.edu

After typing a password on i7, you can exit from i7.

ssh on AIX:
===========

After you download the OpenSSL package, you can install OpenSSL and OpenSSH.

Install the OpenSSL RPM package using the geninstall command:

# geninstall -d/dev/cd0 R:openssl-0.9.6m

Output similar to the following displays:


SUCCESSES
---------
openssl-0.9.6m-3

Install the OpenSSH installp packages using the geninstall command:


# geninstall -I"Y" -d/dev/cd0 I:openssh.base

Use the Y flag to accept the OpenSSH license agreement after you have reviewed the
license agreement.
(Note: we have seen this line as well:
# geninstall -Y -d/dev/cd0 I:openssh.base)

Output similar to the following displays:

Installation Summary
--------------------
Name Level Part Event Result
-------------------------------------------------------------------------------
openssh.base.client 3.8.0.5200 USR APPLY SUCCESS
openssh.base.server 3.8.0.5200 USR APPLY SUCCESS
openssh.base.client 3.8.0.5200 ROOT APPLY SUCCESS
openssh.base.server 3.8.0.5200 ROOT APPLY SUCCESS

You can also use the SMIT install_software fast path to install OpenSSL and
OpenSSH.

The following OpenSSH binary files are installed as a result of the preceding
procedure:

scp File copy program similar to rcp
sftp Program similar to FTP that works over SSH1 and SSH2 protocol
sftp-server SFTP server subsystem (started automatically by sshd daemon)
ssh Similar to the rlogin and rsh client programs
ssh-add Tool that adds keys to ssh-agent
ssh-agent An agent that can store private keys
ssh-keygen Key generation tool
ssh-keyscan Utility for gathering public host keys from a number of hosts
ssh-keysign Utility for host-based authentication
ssh-rand-helper A program used by OpenSSH to gather random numbers. It is used
only on AIX 5.1 installations.
sshd Daemon that permits you to log in
The following general information covers OpenSSH:
The /etc/ssh directory contains the sshd daemon and the configuration files for
the ssh client command.
The /usr/openssh directory contains the readme file and the original OpenSSH open-
source license text file.
This directory also contains the ssh protocol and Kerberos license text.

The sshd daemon is under AIX SRC control. You can start, stop, and view the status
of the daemon
by issuing the following commands:

startsrc -s sshd   OR   startsrc -g ssh   (group)
stopsrc -s sshd    OR   stopsrc -g ssh
lssrc -s sshd      OR   lssrc -g ssh

More on ssh-keygen:
===================

ssh-keygen: password-less SSH login


SSH is often used to login from one system to another without requiring passwords.

A number of methods may be used for that to work properly, one of which is to
setup a
.rhosts file (permission 600) with its content being the name of the remote system
you trust,
followed by the username your trust:

nickel.sao.nrc.ca cantin

would mean you trust user cantin from nickel.sao.nrc.ca to connect to your
account,
without requiring a password.
But for that to work, SSH itself must be configured to trust .rhosts files (which
it does not
for most OpenSSH installations - but we do on most systems RCSG maintains), and
the private/public key pair
of each system must be properly set in the system-wide ssh_known_hosts public key
file.

This, of course, requires help from the local systems administrator.

The second method does not require any help from the systems administrator. And it
does not require modifications
to the .rhosts file. Instead, it requires you generate your own personal set of
private/public pair.

ssh-keygen is used to generate that key pair for you. Here is a session where your
own personal
private/public key pair is created:

cantin@sodium:~> ssh-keygen -t rsa


Generating public/private rsa key pair.
Enter file in which to save the key (/home/cantin/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/cantin/.ssh/id_rsa.
Your public key has been saved in /home/cantin/.ssh/id_rsa.pub.
The key fingerprint is:
f6:61:a8:27:35:cf:4c:6d:13:22:70:cf:4c:c8:a0:23 cantin@sodium

The command ssh-keygen -t rsa initiated the creation of the key pair.

No passphrase was entered (Enter key was pressed instead).

The private key was saved in .ssh/id_rsa. This file is read-only and only for you.
No one else must
see the content of that file, as it is used to decrypt all correspondence
encrypted with the public key.

The public key is save in .ssh/id_rsa.pub.

In this case, the content of file id_rsa.pub is

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEArkwv9X8eTVK4F7pMlSt45pWoiakFkZMw
G9BjydOJPGH0RFNAy1QqIWBGWv7vS5K2tr+EEO+F8WL2Y/jK4ZkUoQgoi+n7DWQVOHsR
ijcS3LvtO+50Np4yjXYWJKh29JL6GHcp8o7+YKEyVUMB2CSDOP99eF9g5Q0d+1U2WVdB
WQM= cantin@sodium

It is one line in length.

Its content is then copied in file .ssh/authorized_keys of the system you wish to
SSH to without
being prompted for a password.

The example shown here generated keys on sodium by user cantin. If the public key
generated,
file .ssh/id_rsa.pub, was copied to your account, file .ssh/authorized_keys on
nickel.sao.nrc.ca,
then user cantin@sodium is allowed to SSH into your own account on
nickel.sao.nrc.ca without
the use of a password.

To summarize, a personal private/public key pair is generated using the ssh-keygen
command. The public key is then copied into the remote system's .ssh/authorized_keys
file, and you can now SSH to the remote system's account without the use of a password.
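
Putting the pieces together, a typical session could look like the sketch below.
The username and hostname are placeholders, and it is assumed that the .ssh
directory already exists on the remote side:

# On the local machine:
ssh-keygen -t rsa                       # accept the defaults; press Enter for an empty passphrase
cat ~/.ssh/id_rsa.pub | ssh fred@remotehost 'cat >> ~/.ssh/authorized_keys'

# On the remote machine, keep the permissions strict:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys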

Example:
--------

The backup user bu520 on a p520 needs to copy backup files to a p550.
The process is a cronjob which uses scp. The user should not be
confronted with a password entry.

On p520:

/home/bu520/.ssh:>ls -al
total 7
drwx------ 2 bu520 staff 512 Apr 24 2006 .
drwxr-xr-x 3 bu520 staff 512 Apr 24 2006 ..
-rw------- 1 bu520 staff 883 Apr 24 2006 id_rsa
-rw-r--r-- 1 bu520 staff 225 Apr 24 2006 id_rsa.pub
-rw-r--r-- 1 bu520 staff 663 Jun 01 2006 known_hosts
/home/bu520/.ssh:>cat id_rsa
-----BEGIN RSA PRIVATE KEY-----
MIICWgIBAAKBgQCq901MXZ+l+QFUkyLUgPskqEYz11eGR0nFr0ydVsUDrAnAQngE
BGNyrURqGxC+vA2dhU1kdeDLa6PlrxrQ9j02hpcG4mSO369BzJ3QEg9C4yPnHxfJ
L9/GauVRzgY3WjmCzwAm51GOsW6S/1s9SQWDG4uepvuUTasIZgf3fktcKQIBIwKB
gQCNqFX9ciUxv7ClKXShci8tAHSunHu4ZvP7kT97DWFpcUnocZakPiaDluDqM67J
7EXLqPb7d50AUd+SbIPu9+mSOTrkXSBII+eVzMIM8yJKgy8+nrsctDE3vw/ZGb+l
Gf8R6zwd2YR0Y2LBS0RSP5DNgf4B6FZO9o+VGTjMlvYkiwJBANfwcJL5G9EQmQkO
zzVhkX4N/oXN3LmmbI9+QMPHhbXiXj2J0sqchx/gir+hcPo9PsRq5gHgtO2Hr+qS
sAFWAMkCQQDKrvV1GFnIzcfVQ7Nwnso5hJ0F2tt5cLV5OXTz/x9Y09n5+M77tBEr
QvunF+Sg9jHUuTHtzTCgfuJUMLqAJJBhAkB1OWGu3wB4zn72Sd4y69Kjg/CRx4Zz
aPkaskBqR72dQF8LdrRCGnU9MMBZZkSlGe7fp76wj+0wfNvXHG4snGbTAkAXKfAq
o7J9WViqqKbLCtVIZu1fwT2nephloCqfit8C1mIN8IyvDUPKbg4huZZ4y63sbO/D
Z+hM200Q76BJKMALAkB/ocrU8gkAiTBqanu0HR8bsLpIQRM+bAohXc2+wGSOFeZG
ZijMWsvl+FDtLWcFgEi3fB6dR86YSax5VFLhsLIL
-----END RSA PRIVATE KEY-----

/home/bu520/.ssh:>cat id_rsa.pub
ssh-rsa
AAAAB3NzaC1yc2EAAAABIwAAAIEAqvdNTF2fpfkBVJMi1ID7JKhGM9dXhkdJxa9MnVbFA6wJwEJ4BARjcq
1EahsQvrwNnYVNZHXgy2uj5a8a0PY9NoaXBuJkjt+vQcyd0BIPQuMj5x8XyS/fxmrlUc4GN1o5gs8AJudR
jrFukv9bPUkFgxuLnqb7lE2rCGYH935LXCk= bu520@ol116u209

/home/bu520/.ssh:>cat id_rsa.pub
ssh-rsa
AAAAB3NzaC1yc2EAAAABIwAAAIEAqvdNTF2fpfkBVJMi1ID7JKhGM9dXhkdJxa9MnVbFA6wJwEJ4BARjcq
1EahsQvrwNnYVNZHXgy2uj5a8a0PY9NoaXBuJkjt+vQcyd0BIPQuMj5x8XyS/fxmrlUc4GN1o5gs8AJudR
jrFukv9bPUkFgxuLnqb7lE2rCGYH935LXCk= bu520@ol116u209

/home/bu520/.ssh:>cat known_hosts
192.168.2.2 ssh-rsa
AAAAB3NzaC1yc2EAAAABIwAAAIEAx16h52LfGNbf5VIn4zDsIWSnFm668YZ3k2immcyA+ih5RRohh9f+Z8
lS9EFDvnNQsTLMwduPBpjXPZY3mZXOVDtpsu6rnKCWKNx9DFaxsLtBSk+1tV4Yr1u7nO6hxs/2vE5xwWys
5qQP0XABJ/m0+eY8IYMkE/LeXXw0to8iz7c=
192.168.2.3 ssh-rsa
AAAAB3NzaC1yc2EAAAABIwAAAIEAzSFdlVb+RyI5k3pWcpsP0oMcAhMgmb7g/GKLfOyAtf1+c+MeVADz3j
JzZywDKvzAJ+o409nhDSIuqvuoRQ2wva08jrPh16ewnSfGzjWY0n9aAMztMwWIvEXodowBNJVSBGV4SZdg
tzqauQ06H22dl0vORdie0/4M5OHYYbV2lxE=
192.168.1.2 ssh-rsa
AAAAB3NzaC1yc2EAAAABIwAAAIEAx16h52LfGNbf5VIn4zDsIWSnFm668YZ3k2immcyA+ih5RRohh9f+Z8
lS9EFDvnNQsTLMwduPBpjXPZY3mZXOVDtpsu6rnKCWKNx9DFaxsLtBSk+1tV4Yr1u7nO6hxs/2vE5xwWys
5qQP0XABJ/m0+eY8IYMkE/LeXXw0to8iz7c=

Automatic startup of sshd on boot:
----------------------------------

For example, on AIX create the following script "Ssshd" in /etc/rc.d/rc2.d:

root@zd110l14:/etc/rc.d/rc2.d#cat Ssshd
#!/bin/ksh

##################################################
# name: Ssshd
# purpose: script that will start or stop the sshd daemon.
##################################################

case "$1" in
start )
startsrc -g ssh
;;
stop )
stopsrc -g ssh
;;
* )
echo "Usage: $0 (start | stop)"
exit 1
esac
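
Before relying on the boot sequence, you would typically make the script executable
and test it by hand, for example:

# chmod 755 /etc/rc.d/rc2.d/Ssshd
# /etc/rc.d/rc2.d/Ssshd start
# lssrc -s sshd                    # verify that the daemon is active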

25. Pipelining and Redirecting:
===============================

CONCEPT: UNIX allows you to connect processes, by letting the standard output of
one process feed into the
standard input of another process. That mechanism is called a pipe.
Connecting simple processes in a pipeline allows you to perform complex tasks
without writing complex programs.

EXAMPLE: Using the more command, and a pipe, send the contents of your .profile
and .shrc files to the
screen by typing

cat .profile .shrc | more


to the shell.

EXERCISE: How could you use head and tail in a pipeline to display lines 25
through 75 of a file?

ANSWER: The command

cat file | head -75 | tail -51

would work. The cat command feeds the file into the pipeline. The head command
gets the first 75 lines of the file, and passes them down the pipeline to tail.
The tail command then filters out all but the last 51 lines of the input it
received from head (lines 25 through 75). It is important to note that in the
above example, tail never
sees the original file, but only sees the part of the file that was passed to it
by the head command.
It is easy for beginners to confuse the usage of the input/output redirection
symbols < and >, with the
usage of the pipe. Remember that input/output redirection connects processes with
files, while the pipe connects
processes with other processes.
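
As an aside, the same range of lines can also be printed with a single sed command,
shown here only as an alternative sketch:

sed -n '25,75p' file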

Grep
The grep utility is one of the most useful filters in UNIX. Grep searches line-by-
line for a specified pattern,
and outputs any line that matches the pattern. The basic syntax for the grep
command is
grep [-options] pattern [file]. If the file argument is omitted, grep will read
from standard input.
It is always best to enclose the pattern within single quotes, to prevent the
shell
from misinterpreting the command.

The grep utility recognizes a variety of patterns, and the pattern specification
syntax was taken from the
vi editor. Here are some of the characters you can use to build grep expressions:

The caret (^) matches the beginning of a line.
The dollar sign ($) matches the end of a line.
The period (.) matches any single character.
The asterisk (*) matches zero or more occurrences of the previous character.
The expression [a-b] matches any characters that are lexically between a and b.

EXAMPLE: Type the command

grep 'jon' /etc/passwd

to search the /etc/passwd file for any lines containing the string "jon".

EXAMPLE: Type the command

grep '^jon' /etc/passwd


to see the lines in /etc/passwd that begin with the character string "jon".

EXERCISE: List all the files in the /tmp directory owned by the user root.

EXPLANATION: The command

ls -l /tmp | grep 'root'

would show all entries with the word "root" somewhere in the line. That doesn't
necessarily mean that all the files would be owned by root, but using the grep
filter can cut down the number of entries you will have to look at.
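
A more precise way to list only the files in /tmp that are actually owned by root
(shown as an alternative sketch, since grep merely matches the string anywhere in
the line) is:

find /tmp -user root -exec ls -ld {} \;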

Redirecting:
------------

CONCEPT: Every program you run from the shell opens three files: Standard input,
standard output,
and standard error. The files provide the primary means of communications between
the programs,
and exist for as long as the process runs.

The standard input file provides a way to send data to a process. As a default,
the standard input is read
from the terminal keyboard.

The standard output provides a means for the program to output data. As a default,
the standard output
goes to the terminal display screen.
The standard error is where the program reports any errors encountered during
execution.
By default, the standard error goes to the terminal display.

CONCEPT: A program can be told where to look for input and where to send output,
using input/output
redirection. UNIX uses the "less than" and "greater than" special characters (<
and >) to signify input
and output redirection, respectively.

Redirecting input
Using the "less-than" sign with a file name like this:
< file1

in a shell command instructs the shell to read input from a file called "file1"
instead of from the keyboard.

EXAMPLE: Use standard input redirection to send the contents of the file
/etc/passwd to the more command:

more < /etc/passwd

Many UNIX commands that will accept a file name as a command line argument, will
also accept input from
standard input if no file is given on the command line.

EXAMPLE: To see the first ten lines of the /etc/passwd file, the command:

head /etc/passwd
will work just the same as the command:
head < /etc/passwd

Redirecting output
Using the "greater-than" sign with a file name like this:
> file2
causes the shell to place the output from the command in a file called "file2"
instead of on the screen.
If the file "file2" already exists, the old version will be overwritten.

EXAMPLE: Type the command

ls /tmp > ~/ls.out

to redirect the output of the ls command into a file called "ls.out" in your home
directory.
Remember that the tilde (~) is UNIX shorthand for your home directory. In this
command, the ls command
will list the contents of the /tmp directory.
Use two "greater-than" signs to append to an existing file. For example:

>> file2

causes the shell to append the output from a command to the end of a file called
"file2". If the file
"file2" does not already exist, it will be created.
EXAMPLE: In this example, I list the contents of the /tmp directory, and put it in
a file called myls.
Then, I list the contents of the /etc directory, and append it to the file myls:

ls /tmp > myls
ls /etc >> myls

Redirecting error
Redirecting standard error is a bit trickier, depending on the kind of shell
you're using
(there's more than one flavor of shell program!). In the POSIX shell and ksh,
redirect the standard error
with the symbol "2>".

EXAMPLE: Sort the /etc/passwd file, place the results in a file called foo, and
trap any errors in a file
called err with the command:

sort < /etc/passwd > foo 2> err

===========================
27. UNIX DEVICES and mknod:
===========================

27.1 Note 1:
============

the files in the /dev directory are a little different from anything you may be
used to in
other operating systems.
The very first thing to understand is that these files are NOT the drivers for the
devices. Drivers are in
the kernel itself (/unix etc..), and the files in /dev do not actually contain
anything at all:
they are just pointers to where the driver code can be found in the kernel. There
is nothing more to it
than that. These aren't programs, they aren't drivers, they are just pointers.

That also means that if the device file points at code that isn't in the kernel,
it obviously is not
going to work. Existence of a device file does not necessarily mean that the
device code is in the kernel,
and creating a device file (with mknod) does NOT create kernel code.

Unix actually even shows you what the pointer is. When you do a long listing of a
file in /dev,
you may have noticed that there are two numbers where the file size should be:

brw-rw-rw- 2 bin bin 2, 64 Dec 8 20:41 fd0

That "2,64" is a pointer into the kernel. I'll explain more about this in a
minute,
but first look at some more files:
brw-rw-rw- 2 bin bin 2, 64 Dec 8 20:41 fd0
brw-rw-rw- 2 bin bin 2, 48 Sep 15 16:13 fd0135ds15
brw-rw-rw- 2 bin bin 2, 60 Feb 12 10:45 fd0135ds18
brw-rw-rw- 1 bin bin 2, 16 Sep 15 16:13 fd0135ds21
brw-rw-rw- 2 bin bin 2, 44 Sep 15 16:13 fd0135ds36
brw-rw-rw- 3 bin bin 2, 36 Sep 15 16:13 fd0135ds9

A different kind of device would have a different major number. For example, here
are the serial com ports:

crw-rw-rw- 1 bin bin 5,128 Feb 14 05:35 tty1A


crw-rw-rw- 1 root root 5, 0 Dec 9 13:13 tty1a
crw-rw-rw- 1 root sys 5,136 Nov 25 07:28 tty2A
crw-r--r-- 1 uucp sys 5, 8 Nov 25 07:16 tty2a

Notice the "b" and the "c" as the first characters in the mode of the file. It
designates whether
we have a block "b", or a character "c" device.

Notice that each of these files shares the "5" part of the pointer, but that the
other number is different.
The "5" means that the device is a serial port, and the other number tells exactly
which com port you are
referring to. In Unix parlance, the 5 is the "major number" and the other is the
"minor number".

These numbers get created with a "mknod" command. For example, you could type
"mknod /dev/myfloppy b 2 60" and
then "/dev/myfloppy" would point to the same driver code that /dev/fd0135ds18
points to, and it would
work exactly the same.

This also means that if you accidentally removed /dev/fd0135ds18, you could
instantly recreate it with "mknod".

But if you didn't know that the magic numbers were "2,60", how could you find out?

It turns out that it's not hard.

First, have a look at "man idmknod". The idmknod command wipes out all non-
required devices, and then recreates them.
Sounds scary, but this gets called every time you answer "Y" to that "Rebuild
Kernel environment?" question that
follows relinking. Actually, on 5.0.4 and on, the existing /dev files don't get
wiped out; the command simply
recreates whatever it has to.

idmknod requires several arguments, and you'd need to get them right to have
success. You could make it easier
by simply relinking a new kernel and answering "Y" to the "Rebuild" question, but
that's using a fire hose to
put out a candle.

A less dramatic method would be to look at the files that idmknod uses to recreate
the device nodes. These are found
in /etc/conf/node.d
In this case, the file you want would be "fd". A quick look at part of that shows:

fd fd0 b 64 bin bin 666


fd fd0135ds36 b 44 bin bin 666
fd fd0135ds21 b 16 bin bin 666
fd fd0135ds18 b 60 bin bin 666
fd fd0135ds15 b 48 bin bin 666
fd fd0135ds9 b 36 bin bin 666
fd fd048 b 4 bin bin 666

This gives you *almost* everything you need to know about the device nodes in the
"fd" class. The only thing it
doesn't tell you is the major number, but you can get that just by doing an "l" of
any other fd entry:

brw-rw-rw- 1 bin bin 2, 60 Feb 5 09:45 fd096ds18

this shows you that the major number is "2".

Armed with these two pieces of information, you can now do

mknod /dev/fd0135ds18 b 2 60
chown bin /dev/fd0135ds18
chgrp bin /dev/fd0135ds18
chmod 666 /dev/fd0135ds18

If you examined the node file closely, you would also notice that /dev/rfd0135ds18
and /dev/fd0135ds18 differ only
in that the "r" version is a "c" or character device and the other is "b" or
block. If you had already known that,
you wouldn't have even had to look at the node file; you'd simply have looked at
an "l" of the /dev/rfd0135ds18 and
recreated the block version appropriately.

There are other fascinating things that can be learned from the node files. For
example, fd096ds18 is also minor number 60,
and can be used in the same way with identical results. In other words, if you z'd
out (were momentarily innattentive,
not CTRL-Z in a job control shell) and dd'd an image to /dev/fd096ds18, it would
write to your hd floppy without incident.

If you have a SCSI tape drive, notice what happens when you set it to be the
"default" tape drive.
It creates device files that have different names (rct0, etc.) but that have the
same major and minor numbers.

Knowing that it's easy to recreate missing device files also means that you can
sometimes capture the output
of programs that write directly to a device. For example, suppose some application
prints directly to /dev/lp
but you need to capture this to a file. In most situations, you can simply "rm
/dev/lp" (after carefully noting
its current ownership, permissions and, of course, major/minor numbers), and then
"touch /dev/lp" to create an
ordinary file. You'll need to chmod it for appropriate permissions, and then run
your app. Unless the app has
tried to do ioctl calls on the device, the output will be there for your use. This
can be particularly useful
for examining control characters that the app is sending.

What's the Difference?


One question that comes up fairly often is "what's the difference between a block
and a character device and when
should I use one rather than the other?". To answer that question fully is hard,
but I'm going to try to at least
get you started here.

The real difference lies in what the kernel does when a device file is accessed
for reading or writing. If the device
is a block device, the kernel gives the driver the address of a kernel buffer that
the driver will use as the source
or destination for data. Note that the address is a "kernel" address; that's
important because that buffer will be
cached by the kernel. If the device is raw , then the address it will use is in
the user space of the process that is
using the device. A block device is something you could make a filesystem on (a
disk). You can move forward and backward,
from the beginning of a block device to its end, and then back to the beginning
again. If you ask to read a block that
the kernel has buffered, then you get data from the buffer. If you ask for a block
that has not yet been buffered,
the kernel reads that block (and probably a few more following it) into the buffer
cache. If you write to a block device,
it goes to the buffer cache (eventually to the device, of course). A raw (or
character) device is often something that
doesn't have a beginning or end; it just gives a stream of characters that you
read. A serial port is an excellent
example- however, it is not at all unusual to have character (raw) drivers for
things that do have a beginning
and an end- a tape drive, for example. And many times there are BOTH character and
block devices for the same
physical device- disks, for example. Nor does using a raw device absolutely mean
that you can't move forward and back,
from beginning to end- you can move wherever you want with a tape or /dev/rfd0.

And that's where the differences get confusing. It seems pretty reasonable that
you'd use the block device to mount
a disk. But which do you use for format? For fsck? For mkfs?

Well, if you try to format /dev/fd0135ds18, you'll be told that it is not a


formattable device.
Does that make any sense? Well, the format process involves sequential access- it
starts at the beginning and just
keeps on going, so it seems to make sense that it wouldn't use the block device.
But you can run "mkfs" on either
the block or character device; it doesn't seem to care. The same is true for fsck.
But although that's true for those
programs on SCO OSR5, it isn't necessarily going to be true on some other UNIX,
and the "required" device may make sense
to whoever wrote the program, but it may not make sense to you.

You'd use a block device when you want to take advantage of the caching provided
by the kernel. You'd use the raw device
when you don't, or for ioctl operations like "tape status" or "stty -a".

27.2 Note 2:
============

One of the unique things about Unix as an operating system is that it regards
everything as a file. Files can be divided into
three categories; ordinary or plain files, directories, and special or device
files.

Directories in Unix are properly known as directory files. They are a special type
of file that holds a list of the
other files they contain.

Ordinary or plain files in Unix are not all text files. They may also contain
ASCII text, binary data, and program input
or output. Executable binaries (programs) are also files, as are commands. When a
user enters a command, the associated
file is retrieved and executed. This is an important feature and contributes to
the flexibility of Unix.

Special files are also known as device files. In Unix all physical devices are
accessed via device files; they are
what programs use to communicate with hardware. Files hold information on
location, type, and access mode for a
specific device. There are two types of device files; character and block, as well
as two modes of access.

- Block device files are used to access block device I/O. Block devices do
buffered I/O, meaning that the data is
collected in a buffer until a full block can be transferred.

- Character device files are associated with character or raw device access. They
are used for unbuffered data transfers
to and from a device. Rather than transferring data in blocks, the data is
transferred character by character.
One transfer can consist of multiple characters.

So what about a device that could be accessed in character or block mode? How many
device files would it have?

One.
Two.
There are no such devices.

Some devices, such as disk partitions, may be accessed in block or character mode.
Because each device file corresponds
to a single access mode, physical devices that have more than one access mode will
have more than one device file.

Device files are found in the /dev directory. Each device is assigned a major and
minor device number. The major
device number identifies the type of device, i.e. all SCSI devices would have the
same number as would all the keyboards.
The minor device number identifies a specific device, i.e. the keyboard attached
to this workstation.
Device files are created using the mknod command. The form for this command is:

mknod device-name type major minor

device-name is the name of the device file


type is either "c" for character or "b" for block
major is the major device number
minor is the minor device number
The major and minor device numbers are indexed to device switches. There are two
types of device switches; c
devsw for character devices and bdevsw for block devices. These switches are
kernel structures that hold the names
of all the control routines for a device and tell the kernel which driver module
to execute. Device switches are
actually tables that look something like this:

0 keyboard
1 SCSIbus
2 tty
3 disk
Using the ls command in the /dev directory will show entries that look like:

brw-r----- 1 root sys 1, 0 Aug 31 16:01 /dev/sd1a

The "b" before the permissions indicates that this is a block device file. When a
user enters /dev/sd1a the kernel sees
the file opening, realizes that it's major device number 1, and calls up the
SCSIbus function to handle it.

====================
28. Solaris devices:
====================

Devices are described in three ways in the Solaris environment, using three
distinct naming
conventions: the physical device name, the instance name, and the logical device
name.

Solaris stores the entries for physical devices under the /devices directory,
and the logical device entries behind the /dev directory.

- A "physical device name" represents the full pathname of the device.


Physical device files are found in the /devices directory and have a
naming convention like the following example:

/devices/sbus@1,f8000000/esp@0,40000/sd@3,0:a

Each device has a unique name representing both the type of device and the
location of that device
in the system-addressing structure called the "device tree". The OpenBoot
firmware builds the
device tree for all devices from information gathered at POST. The device tree
is loaded in memory
and is used by the kernel during boot to identify all configured devices.
A device pathname is a series of node names separated by slashes.
Each device has the following form:
driver-name@unit-address:device-arguments

- The "instance name" represents the kernel's abbreviated name for every possible
device
on the system. For example, sd0 and sd1 represents the instance names of two
SCSI disk devices.
Instance names are mapped in the /etc/path_to_inst file, and are displayed by
using the
commands dmesg, sysdef, and prtconf

- The "Logical device names" are used with most Solaris file system commands to
refer to devices.
Logical device files in the /dev directory are symbolically linked to physical
device files
in the /devices directory. Logical device names are used to access disk devices
in the
following circumstances:
- adding a new disk to the system and partitioning the disk
- moving a disk from one system to another
- accessing or mounting a file system residing on a local disk
- backing up a local file system
- repairing a file system

Logical devices are organized in subdirs under the /dev directory by their
device types
/dev/dsk block interface to disk devices
/dev/rdsk raw or character interface to disk devices.
In commands, you mostly use raw logical devices, like for example #
newfs /dev/rdsk/c0t3d0s7
/dev/rmt tape devices
/dev/term serial line devices
etc..

Logical device files have a major and minor number that indicate device drivers,

hardware addresses, and other characteristics.


Furthermore, a device filename must follow a specific naming convention.
A logical device name for a disk drive has the following format:

/dev/[r]dsk/cxtxdxsx

where cx refers to the SCSI controller number, tx to the SCSI bus target number,
dx to the disk number (always 0 except on storage arrays)
and sx to the slice or partition number.
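
You can see the mapping from a logical device name to its physical device name with
a long listing; the device name below is just an example, and the output shows a
symbolic link pointing into the /devices tree:

# ls -l /dev/rdsk/c0t3d0s2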

===========================
29. filesystems in Solaris:
===========================

Checks on the filesystems in Solaris:


-------------------------------------

1. used space etc..


# df -k, df -h etc..

# du -ks /home/fred

Shows only a summary of the disk usage of the /home/fred subdirectory (measured in
kilobytes).

# du -ks /home/fred/*

Shows a summary of the disk usage of each subdirectory of /home/fred (measured in


kilobytes).

# du -s /home/fred

Shows a total summary of /home/fred

# du -sg /data

Shows a total summary of /data in GB

This command shows the diskusage of /dirname in GB


# du -g /dirname

2. examining the disklabel


# prtvtoc /dev/rdsk/c0t3d0s2

3. format just by itself shows the disks


# format

# format -> specify disk -> choose partition -> choose print to get the partition
table

4. Display information about SCSI devices

# cfgadm -al

or, from the PROM, commands like probe-scsi

Recovering disk partition information in Solaris:


-------------------------------------------------

Use the fmthard command to write the backup VTOC information back to the disk.
The following example uses the fmthard command to recover a corrupt label on
the disk c0t3d0. The backup VTOC information is in a file named c0t3d0s0
in the /vtoc directory.

# fmthard -s /vtoc/c0t3d0s0 /dev/rdsk/c0t3d0s2
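
The backup file used above has to exist before the label gets corrupted, of course.
A sketch of how it could have been created earlier (the /vtoc directory and the file
name simply follow the example above):

# mkdir -p /vtoc
# prtvtoc /dev/rdsk/c0t3d0s2 > /vtoc/c0t3d0s0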

Remember that the format of /dev/(r)dsk/cWtXdYsZ means:

W is the controller number,
X is the SCSI target number,
Y is the logical unit number (LUN, almost always 0),
Z is the slice or partition number

Make a new filesystem in Solaris:
---------------------------------

To create a UFS filesystem on a formatted disk that already has been divided into
slices
you need to know the raw device filename of the slice that will contain the
filesystem.
Example:

# newfs /dev/rdsk/c0t3d0s7

defaults on UFS on Solaris:


blocksize 8192
fragmentsize 1024
one inode for each 2K of diskspace

FSCK in Solaris:
----------------

If you just want to determine the state of a filesystem, whether it needs checking,
you can use the fsck command while the fs is mounted.
Example:

# fsck -m /dev/rdsk/c0t0d0s6

The state flag in the superblock of the filesystem you specify is checked to see
whether the filesystem is clean or requires checking.

If you ommit the device argument, all the filesystems listed in /etc/vfstab with
a fsck
pass value greater than 0 are checked.

Adding a disk in Solaris 2.6, 2.7, 8, 9, 10:


--------------------------------------------

In case you have just built in a new disk, it's probably best to first use
the probe-scsi command from the OK prompt:

ok probe-scsi
..
Target 3
Unit 0 Disk Seagate ST446452W 0001
ok boot -r

Specifying the -r flag when booting tells Solaris to reconfigure itself by
scanning for new hardware.
Once the system is up, check the output for "dmesg" to find kernel messages
relating
to the new disk.
You probably find complaints telling you stuff as "corrupt label - wrong magic
number" etc..
That's good, because we now know that the kernel is aware of this new disk.

In this example, our disk is SCSI target 3, so we can refer to the whole disk as
/dev/rdsk/c0t3d0s2     # slice 2, or partition 2; s2 refers to the whole disk

Remember that the format of /dev/(r)dsk/cWtXdYsZ means:

W is the controller number,
X is the SCSI target number,
Y is the logical unit number (LUN, almost always 0),
Z is the slice or partition number

We now use the format program to partition the disk, and afterwards create
filesystems.

# format /dev/rdsk/c0t3d0s2
(.. output..)
FORMAT MENU:

format>label
Ready to label disk, continue? y

format>partition
PARTITION MENU:

partition>

Once you have created and sized the partitions, you can get a list with the
"partition>print" command.

Now, for example, you can create a filesystem like in the following command:

# newfs /dev/rdsk/c0t3d0s0
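
After the newfs, the filesystem still has to be mounted, and normally you would also
add it to /etc/vfstab so that it is mounted at boot. The mount point /data and the
slice below are examples only:

# mkdir /data
# mount /dev/dsk/c0t3d0s0 /data

A matching one-line entry in /etc/vfstab would look something like:

/dev/dsk/c0t3d0s0  /dev/rdsk/c0t3d0s0  /data  ufs  2  yes  -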

================
30. AIX devices:
================

In AIX 5.x, the device configuration information is stored in the ODM repository.
The corresponding files
are in

/etc/objrepos
/usr/lib/objrepos
/usr/share/lib/objrepos

There are 2 sections in ODM:


- predefined: all of the devices in principle supported by the OS
- customized: all devices already configured in the system

Every device in ODM has a unique definition that is provided by 3 attributes:

1. Type
2. Class
3. Subclass

Information that's stored in the ODM:

- PdDv,PdAt, PdCn : Predefined device information


- CuDv, CuAt, CuDep : Customized device information
- lpp, inventory : Software vital product data
- smit menu's
- Error log, alog, and dump information
- System Resource Controller: SRCsubsys, SRCsubsrv
- NIM: nim_attr, nim_object, nim_pdattr

There are commands, representing an interface to ODM, so you can add, retrieve,
drop and change objects.
The following commands can be used with ODM:

odmadd,
odmdrop,
odmshow,
odmdelete,
odmcreate,
odmchange

Examples:

# odmget -q "type LIKE lv*" PdDv


# odmget -q name=hdisk0 CuAt

Logical devices and physical devices:


-------------------------------------

AIX includes both logical devices and physical devices in the ODM device
configuration database.
Logical devices include Volume Groups, Logical Volumes, network interfaces and so
on.
Physical devices are adapters, modems etc..

Most devices are selfconfiguring devices, only serial devices (modems, printers)
are not selfconfigurable.

The command that configures devices is "cfgmgr", the "configuration manager".


When run, it compares the information from the device with the predefined section
in ODM.
If it finds a match, then it creates the entries in the customized section in ODM.

The configuration manager runs every time the system is restarted.

If you have installed an adapter for example, and you have put the software in a
directory
like /usr/sys/inst.images, you can call cfgmgr to install device drivers as well
with

# cfgmgr -i /usr/sys/inst.images


Device information:
-------------------

The most important AIX command to show device info is "lsdev". This command
queries the ODM, so we can use
it to locate the customized or the predefined devices.

The main commands in AIX to get device information are:


- lsdev : queries ODM
- lsattr : gets specific configuration attributes of a device
- lscfg : gets vendor name, serial number, type, model etc.. of the device

lsdev also shows the status of a device as Available (that is, configured) or as
Defined (that is, predefined).

lsdev examples:
---------------

If you need to see disk or other devices, defined or available, you can use the
lsdev command
as in the following examples:

# lsdev -Cc tape


rmt0 Available 10-60-00-5,0 SCSI 8mm Tape Drive

# lsdev -Cc disk


hdisk0 Available 20-60-00-8,0 16 Bit LVD SCSI Disk Drive
hdisk1 Available 20-60-00-9,0 16 Bit LVD SCSI Disk Drive
hdisk2 Available 20-60-00-10,0 16 Bit LVD SCSI Disk Drive
hdisk3 Available 20-60-00-11,0 16 Bit LVD SCSI Disk Drive
hdisk4 Available 20-60-00-13,0 16 Bit LVD SCSI Disk Drive

Note: -C queries the Customized section of ODM, -P queries the Predefined section
of ODM.

Example if some of the disks are on a SAN (through FC adapters):

# lsdev -Cc disk


hdisk0 Available Virtual SCSI Disk Drive
hdisk1 Available Virtual SCSI Disk Drive
hdisk2 Available 02-08-02 SAN Volume Controller MPIO Device (through FC adapter)
hdisk3 Available 02-08-02 SAN Volume Controller MPIO Device (through FC adapter)

# lsattr -El hdisk2


PCM              PCM/friend/sddpcm                                     PCM                                      True
PR_key_value     none                                                  Reserve Key                              True
algorithm        load_balance                                          Algorithm                                True
dist_err_pcnt    0                                                     Distributed Error Percentage             True
dist_tw_width    50                                                    Distributed Error Sample Time            True
hcheck_interval  20                                                    Health Check Interval                    True
hcheck_mode      nonactive                                             Health Check Mode                        True
location                                                               Location Label                           True
lun_id           0x0                                                   Logical Unit Number ID                   False
lun_reset_spt    yes                                                   Support SCSI LUN reset                   True
max_transfer     0x40000                                               Maximum TRANSFER Size                    True
node_name        0x50050768010029c8                                    FC Node Name                             False
pvid             00cb5b9e66cc16470000000000000000                      Physical volume identifier               False
q_type           simple                                                Queuing TYPE                             True
qfull_dly        20                                                    delay in seconds for SCSI TASK SET FULL  True
queue_depth      20                                                    Queue DEPTH                              True
reserve_policy   no_reserve                                            Reserve Policy                           True
rw_timeout       60                                                    READ/WRITE time out value                True
scbsy_dly        20                                                    delay in seconds for SCSI BUSY           True
scsi_id          0x611013                                              SCSI ID                                  False
start_timeout    180                                                   START unit time out value                True
unique_id        33213600507680190014E30000000000001E204214503IBMfcp  Device Unique Identification             False
ww_name          0x50050768014029c8                                    FC World Wide Name                       False

lsdev [ -C ] [ -c Class ] [ -s Subclass ] [ -t Type ] [ -f File ]
      [ -F Format | -r ColumnName ] [ -h ] [ -H ] [ -l { Name | - } ]
      [ -p Parent ] [ -S State ]

lsdev -P [ -c Class ] [ -s Subclass ] [ -t Type ] [ -f File ]
      [ -F Format | -r ColumnName ] [ -h ] [ -H ]

Remark:

For locally attached SCSI devices, the general format of the LOCATION code "AB-CD-EF-GH"
is actually "AB-CD-EF-G,H": the first three sections are the same, and in the GH section
the G is the SCSI ID and the H is the LUN.
For adapters, only the AB-CD part is mentioned in the location code.

A location code is a representation of the path to the device, from drawer, slot,
connector and port.

- For an adapter it is sufficient to have the codes of the drawer and slot to
identify
the adapter. The location code of an adapter takes the form of AB-CD.

- Other devices needs more specification, like a specific disk on a specific SCSI
bus.
For other devices the format is AB-CD-EF-GH.
The AB-CD part then indicates the adapter the device is connected on.

- For SCSI devices we have a location code like AB-CD-EF-S,L where the S,L fields
identifies
the SCSI ID and LUN of the device.

To lists all devices in the Predefined object class with column headers, use
# lsdev -P -H

To list the adapters that are in the Available state in the Customized Devices
object class, use
# lsdev -C -c adapter -S a

lsattr examples:
----------------

This command gets the current attributes (-E flag) for a tape drive:

# lsattr -El rmt0


mode yes Use DEVICE BUFFERS during writes True
block_size 1024 Block size (0=variable length) True
extfm no Use EXTENDED file marks True
ret no RETENSION on tape change or reset True
..
..

(Of course, the equivalent of the above command is, for example, # lsattr -l rmt0 -E)

To list the default values for that tape device (-D flag), use
# lsattr -D -l rmt0

This command gets the attributes for a network adapter:

# lsattr -E -l ent1
busmem 0x3cfec00 Bus memory address False
busintr 7 Bus interrupt level False
..
..

To list only a certain attribute (-a flag), use the command as in the following
example:

# lsattr -E -l scsi0 -a bus_intr_lvl


bus_intr_lvl 14 Bus interrupt level False

# lsattr -El tty0 -a speed


speed 9600 BAUD rate true

You must specify one of the following flags with the lsattr command:
-D Displays default values.
-E Displays effective values (valid only for customized devices specified with
the -l flag).
-F Format Specifies the user-defined format.
-R Displays the range of legal values.
-a Displays for that attribute

lscfg examples:
---------------

Example 1:

This command gets the Vital Product Data for the tape drive rmt0:

# lscfg -vl rmt0


Manufacturer...............EXABYTE
Machine Type and Model.....IBM-20GB
Device Specific(Z1)........38zA
Serial Number..............60089837
..
..

-l Name Displays device information for the named device.

-p Displays the platform-specific device information. This flag only applies to
   AIX 4.2.1 or later.

-v Displays the VPD found in the Customized VPD object class. Also, on AIX 4.2.1
or later, displays platform specific VPD when used with the -p flag.

-s Displays the device description on a separate line from the name and
location.

# lscfg -vp | grep -p 'Platform Firmware:'

# lscfg -vp | grep -p Platform

sample output:

Platform Firmware:
ROM Level.(alterable).......3R040602
Version.....................RS6K
System Info Specific.(YL)...U1.18-P1-H2/Y2
Physical Location: U1.18-P1-H2/Y2

The ROM Level denotes the firmware/microcode level.

Another sample:

Platform Firmware:
ROM Level ............. RH020930
Version ................RS6K
..

Example 2:

The following command shows details about the Fiber Channel cards:
# lscfg -vl fcs*     (fcs0, for example, is the parent of fscsi0)

Adding a device:
----------------

Adding a device with cfmgr:


---------------------------

To add a device you can run cfgmgr, or shutdown the system, attach the new device
and boot the system.
There are also many smitty screens to accomplish the task of adding a new device.

Adding a device with mkdev:


---------------------------

Also the mkdev command can be used as in the following example:

# mkdev -c tape -s scsi -t scsd -p scsi0 -w 5,0

where

-c Class of the device


-s Subclass of the device
-t Type of the device. This is a specific attribute for the device
-p The parent adapter of the device. You have to specify the logical name.
-w You have to know the SCSI ID that you are going to assign to the new
device.
If it's non SCSI, you have to know the port number on the adapter.
-a Specifies the device attribute-value pair

The mkdev command also creates the ODM entries for the device and loads the device
driver.

The following command configures a new disk and ensures that it is available as a
physical volume.
This example adds a 2.2GB disk with a scsi ID of 6 and a LUN of 0 to the scsi3
SCSI bus.

# mkdev -c disk -s scsi -t 2200mb -p scsi3 -w 6,0 -a pv=yes

This example adds a terminal:

# mkdev -c tty -t tty -s rd232 -p sa1 -w 0 -a login=enable -a term=ibm3151


tty0 Available

Changing a device with chdev:


-----------------------------

Suppose you have just added a new disk. Suppose the cfgmgr has run and detected
the disk.

Now you run


# lspv
hdisk1 none none
OR
hdisk1 0005264d2 none

The first field identifies the system-assigned name of the disk. The second field
displays the
"physical volume id" PVID. If that is not shown, you can use chdev:

# chdev -l hdisk1 -a pv=yes

Removing a device with rmdev:


-----------------------------

Examples:

# lsdev -Cc tape


rmt0 Available 10-60-00-5,0 SCSI 8mm Tape Drive

# rmdev -l rmt0 # -l indicates using the logical device name


rmt0 Defined

The status have shifted from Available to Defined.

# lsdev -Cc tape


rmt0 Defined 10-60-00-5,0 SCSI 8mm Tape Drive

If you really want to remove it from the system, use the -d flag as well

# rmdev -l rmt0 -d

To unconfigure the children of PCI bus pci1 and all devices under them, while
retaining their device definitions in the Customized Devices object class:

# rmdev -p pci1
rmt0 Defined
hdisk1 Defined
scsi1 Defined
ent0 Defined

The special device sys0:
------------------------

In AIX 5.x we have a special device named sys0 that is used to manage some kernel
parameters.
The way to change these values is by using smitty, the chdev command or WSM.

Example.

To change the maximum number of processes per user (the maxuproc attribute), you
can for example use the Web-based System Manager.
You can also use the chdev command:

#chdev -l sys0 -a maxuproc=50


sys0 changed
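
Before changing anything, it can be useful to list the current sys0 attributes with
lsattr; a small sketch:

# lsattr -El sys0                  # list all sys0 attributes and their current values
# lsattr -El sys0 -a maxuproc      # show only the maxuproc attribute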

Note: In Solaris, to change kernel parameters, you have to edit /etc/system.

Device drivers:
---------------

Device drivers are located in /usr/lib/drivers directory.
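
The drivers themselves are kernel extensions. To see what is shipped and what is
currently loaded, a small sketch (genkex is assumed to be installed; it comes with
the AIX performance tools):

# ls /usr/lib/drivers              # driver object files shipped with the system
# genkex | grep /usr/lib/drivers   # kernel extensions currently loaded from there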

============================
31. filesystem commands AIX:
============================

31.1 The Logical Volume Manager LVM:
====================================

In AIX, it's common to use the Logical Volume Manager (LVM) to cross the boundaries
posed by traditional disk management.
Traditionally, a filesystem was on a single disk or on a single partition.
Changing a partition size was a difficult task. With the LVM, we can create logical
volumes which can span several disks.

The LVM has been a feature of the AIX operating system since version 3, and it is
installed
automatically with the Operating System.

LVM commands in AIX:
--------------------

mkvg (or the mkvg4vp command in case of SAN vpath disks. See section 31.3)
cplv
rmlv
mklvcopy
extendvg
reducevg
getlvcb
lspv
lslv
lsvg
mirrorvg
chpv
migratepv
exportvg, importvg
varyonvg, varyoffvg

And related commands:


mkdev
chdev
rmdev
lsdev

Volume group:
-------------
A physical disk, once known to the LVM, is called a physical volume. When you add a
physical volume to a volume group, the physical volume is partitioned into
contiguous equal-sized units of space called "physical partitions".
A physical partition is the smallest unit of storage space allocation and is a
contiguous space on a physical volume.
To be used, the physical volume must become part of a volume group. The disk must be
in an available state and must have a "physical volume id" (PVID) assigned to it.

A volume group (VG) is an entity consisting of 1 to 32 physical volumes (of
varying sizes and types).
A "Big volume group" can scale up to 128 devices.

You create a volume group with the "mkvg" command. You add a physical volume to an
existing volume group with
the "extendvg" command, you make use of the changed size of a physical volume with
the "chvg" command,
and remove a physical volume from a volume group with the "reducevg" command.
Some of the other commands that you use on volume groups include:
list (lsvg), remove (exportvg), install (importvg), reorganize (reorgvg),
synchronize (syncvg),
make available for use (varyonvg), and make unavailable for use (varyoffvg).

To create a VG, using local disks, use the "mkvg" command:

mkvg -y <name_of_volume_group> -s <partition_size> <list_of_hard_disks>

Typical example:

mkvg -y oravg -s 64 hdisk3 hdisk4

mkvg -y appsvg -s 32 hdisk2


mkvg -y datavg -s 64 hdisk3

mkvg -y appsvg -s 32 hdisk3


mkvg -y datavg -s 32 hdisk2
mkvg -y vge1corrap01 -s 64 hdisk2

In case you use the so-called SDD subsystem with vpath SAN storage, you should use
the "mkvg4vp" command, which works similarly to (and takes the same flags as) the
mkvg command.

Types of VG's:
==============

There are 3 kinds of VG's:

- Normal VG (AIX 5L)
- Big VG (AIX 5L)
- Scalable VG (as from AIX 5.3)

Normal VG:
----------

Number of disks   Max number of partitions/disk
 1                32512
 2                16256
 4                 8128
 8                 4064
16                 2032
32                 1016

Big VG:
-------
Number of disks Max number of partitions/disk
1 130048
2 65024
4 32512
8 16256
16 8128
32 4064
64 2032
128 1016

VG Type    Max PV's   Max LV's   Max PP's per VG
---------------------------------------------------------------
Normal         32        256          32512
Big           128        512         130048
Scalable     1024       4096        2097152
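
As an illustrative sketch (the disks named below are assumed to be free, unassigned
disks), a Big VG or a Scalable VG can be created with the -B and -S flags of mkvg:

# mkvg -B -y bigvg  -s 64 hdisk5            # Big VG (up to 128 PVs)
# mkvg -S -y scalvg -s 64 hdisk6 hdisk7     # Scalable VG (AIX 5.3 and later)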

Physical Partition:
===================

You can change the NUMBER of PPs in a VG, but you cannot change the SIZE of PPs
afterwards.
Defaults:
- 4 MB partition size. The PP size must be a power of 2, from 1 MB up to a maximum
  of 1024 MB.
- The default is 1016 PPs per disk. You can increase the maximum number of PPs per
  PV in powers of 2 (with "chvg -t"), but the maximum number of disks per VG is
  decreased accordingly.

#disks   max # of PPs / disk
  32        1016
  16        2032
   8        4064
   4        8128
   2       16256
   1       32512

In the case of a set of "normal" internal disks of, for example, 30G or 70G or so,
common partition sizes are 64M or 128M.

Logical Partition:
------------------
An LP maps to (at least) one PP (two or three if mirrored), and is the smallest
unit of allocatable space as seen from the logical volume.

Logical Volume:
---------------

Consists of LPs in a VG. A LV consists of LPs from actual PPs from one or more
disks.

|-----| | ----|
|LP1 | ---> | PP1 |
|-----| | ----|
|LP2 | ---> | PP2 |
|-----| | ----|
|.. | hdisk 1 (Physical Volume 1)
|.. |
|.. |
|-----| |---- |
|LPn | ---> |PPn |
|-----| |---- |
|LPn+1| ---> |PPn+1|
|-----| |---- |
Logical Volume hdisk2 (Physical Volume 2)

So, a VG is a collection of related PVs, but the actual LVs are created within the
VG. For the applications, the LVs are the entities they work with.
In AIX, a filesystem like "/data" corresponds to a LV.
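
To see which logical volume backs a given filesystem, you can query lsfs or look at
the stanza in /etc/filesystems. A small sketch (the /data filesystem is just an
assumed example):

# lsfs /data                        # shows the backing /dev/<lvname> device
# grep -p "/data:" /etc/filesystems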

lspv Command
------------

Purpose: Displays information about a physical volume within a volume group.

lspv [ -L ] [ -l | -p | -M ] [ -n DescriptorPhysicalVolume ] [ -v VolumeGroupID ]
     PhysicalVolume

-p: lists range, state, region, LV names, type and mount points

# lspv
# lspv hdisk3
# lspv -p hdisk3

# lspv
hdisk0 00453267554 rootvg
hdisk1 00465249766 rootvg

# lspv hdisk23
PHYSICAL VOLUME: hdisk23 VOLUME GROUP: oravg
PV IDENTIFIER: 00ccf45d564cfec0 VG IDENTIFIER
00ccf45d00004c0000000104564d2386
PV STATE: active
STALE PARTITIONS: 0 ALLOCATABLE: yes
PP SIZE: 256 megabyte(s) LOGICAL VOLUMES: 3
TOTAL PPs: 947 (242432 megabytes) VG DESCRIPTORS: 1
FREE PPs: 247 (63232 megabytes) HOT SPARE: no
USED PPs: 700 (179200 megabytes)
FREE DISTRIBUTION: 00..00..00..57..190
USED DISTRIBUTION: 190..189..189..132..00

# lspv -p hdisk23
hdisk23:
PP RANGE STATE REGION LV NAME TYPE MOUNT POINT
1-22 used outer edge u01 jfs2 /u01
23-190 used outer edge u02 jfs2 /u02
191-379 used outer middle u01 jfs2 /u01
380-568 used center u01 jfs2 /u01
569-600 used inner middle u02 jfs2 /u02
601-700 used inner middle u03 jfs2 /u03
701-757 free inner middle
758-947 free inner edge

# lspv -p hdisk0
hdisk0:
PP RANGE STATE REGION LV NAME TYPE MOUNT POINT
1-1 used outer edge hd5 boot N/A
2-48 free outer edge
49-51 used outer edge hd9var jfs /var
52-52 used outer edge hd2 jfs /usr
53-108 used outer edge hd6 paging N/A
109-116 used outer middle hd6 paging N/A
117-215 used outer middle hd2 jfs /usr
216-216 used center hd8 jfslog N/A
217-217 used center hd4 jfs /
218-222 used center hd2 jfs /usr
223-320 used center hd4 jfs /
..
..

Note that in this example the Logical Volumes correspond to the filesystems in the
following way:
hd4= /, hd5=boot, hd6=paging, hd2=/usr, hd3=/tmp, hd9var=/var

lslv Command
------------
Purpose: Displays information about a logical volume.

To Display Logical Volume Information:

lslv [ -L ] [ -l | -m ] [ -n PhysicalVolume ] LogicalVolume

To Display the Logical Volume Allocation Map:

lslv [ -L ] [ -n PhysicalVolume ] -p PhysicalVolume [ LogicalVolume ]

# lslv -l lv06
lv06:/backups
PV COPIES IN BAND DISTRIBUTION
hdisk3 512:000:000 100% 000:218:218:076:000

# lslv lv06
LOGICAL VOLUME: lv06 VOLUME GROUP: backupvg
LV IDENTIFIER: 00c8132e00004c0000000106ef70cec2.2 PERMISSION: read/write
VG STATE: active/complete LV STATE: opened/syncd
TYPE: jfs WRITE VERIFY: off
MAX LPs: 512 PP SIZE: 64 megabyte(s)
COPIES: 1 SCHED POLICY: parallel
LPs: 512 PPs: 512
STALE PPs: 0 BB POLICY: relocatable
INTER-POLICY: minimum RELOCATABLE: yes
INTRA-POLICY: middle UPPER BOUND: 32
MOUNT POINT: /backups LABEL: /backups
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?: NO

# lslv -p hdisk3
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 1-10
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 11-20
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 21-30
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 31-40
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 41-50
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 51-60
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 61-70
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 71-80
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 81-90
..
..

Also, you can list LVs per VG by running, for example:

# lsvg -l backupvg
backupvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
loglv02 jfslog 1 1 1 open/syncd N/A
lv06 jfs 512 512 1 open/syncd /backups

# lsvg -l splvg
splvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
loglv01 jfslog 1 1 1 open/syncd N/A
lv04 jfs 240 240 1 open/syncd /data
lv00 jfs 384 384 1 open/syncd /spl
lv07 jfs 256 256 1 open/syncd /apps

For a complete storage system, this could yield in for example:

-redovg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
redo1lv jfs2 42 42 3 open/syncd /u05
redo2lv jfs2 1401 1401 3 open/syncd /u04
loglv03 jfs2log 1 1 1 open/syncd N/A
-db2vg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
db2lv jfs2 600 600 2 open/syncd /db2_database
loglv00 jfs2log 1 1 1 open/syncd N/A
-oravg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
u01 jfs2 800 800 2 open/syncd /u01
u02 jfs2 400 400 2 open/syncd /u02
u03 jfs2 200 200 2 open/syncd /u03
logfs jfs2log 2 2 1 open/syncd N/A
-rootvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
hd5 boot 1 2 2 closed/syncd N/A
hd6 paging 36 72 2 open/syncd N/A
hd8 jfs2log 1 2 2 open/syncd N/A
hd4 jfs2 8 16 3 open/syncd /
hd2 jfs2 24 48 2 open/syncd /usr
hd9var jfs2 9 18 3 open/syncd /var
hd3 jfs2 11 22 3 open/syncd /tmp
hd1 jfs2 10 20 2 open/syncd /home
hd10opt jfs2 2 4 2 open/syncd /opt
fslv00 jfs2 1 2 2 open/syncd /XmRec
fslv01 jfs2 2 4 3 open/syncd /tmp/m2
paging00 paging 32 32 1 open/syncd N/A
sysdump1 sysdump 80 80 1 open/syncd N/A
oralv jfs2 100 100 1 open/syncd /opt/app/oracle
fslv03 jfs2 63 63 2 open/syncd /bmc_home

And you can list the LVs by PV by running


# lspv -l hdiskn

lsvg Command:
-------------

-o          Shows only the active volume groups.
-p VG_name  Shows all the PVs that belong to the vg_name.
-l VG_name  Shows all the LVs that belong to the vg_name.

Examples:

# lsvg
rootvg
informixvg
oravg

# lsvg -o
rootvg
oravg

# lsvg oravg
VOLUME GROUP: oravg VG IDENTIFIER:
00ccf45d00004c0000000104564d2386
VG STATE: active PP SIZE: 256 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 1894 (484864 megabytes)
MAX LVs: 256 FREE PPs: 492 (125952 megabytes)
LVs: 4 USED PPs: 1402 (358912 megabytes)
OPEN LVs: 4 QUORUM: 2
TOTAL PVs: 2 VG DESCRIPTORS: 3
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 2 AUTO ON: yes
MAX PPs per PV: 1016 MAX PVs: 32
LTG size: 128 kilobyte(s) AUTO SYNC: no
HOT SPARE: no BB POLICY: relocatable

# lsvg -p informixvg
informixvg
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk3 active 542 462 109..28..108..108..109
hdisk4 active 542 447 109..13..108..108..109

# lsvg -l rootvg
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
hd5 boot 1 1 1 closed/syncd N/A
hd6 paging 24 24 1 open/syncd N/A
hd8 jfslog 1 1 1 open/syncd N/A
hd4 jfs 4 4 1 open/syncd /
hd2 jfs 76 76 1 open/syncd /usr
hd9var jfs 4 4 1 open/syncd /var
hd3 jfs 6 6 1 open/syncd /tmp
paging00 paging 20 20 1 open/syncd N/A
..
..

Suppose we have a 70 GB disk = 70000 MB.
With the default maximum of 1016 partitions per disk, each PP must be at least
70000/1016 = 69 MB, so a PP size of 128 MB (the next power of 2) is needed,
unless you raise the PP-per-PV limit with "chvg -t".

extendvg command:
-----------------

extendvg VGName hdiskNumber

# extendvg newvg hdisk23

How to Add a Disk to a Volume Group?

extendvg VolumeGroupName hdisk0 hdisk1 ... hdiskn

reducevg command:
-----------------

To remove a PV from a VG:

# reducevg myvg hdisk23

To remove a VG:

Suppose we have a VG informixvg with 2 PV, hdisk3 and hdisk4:

# reducevg -d informixvg hdisk4

When you delete the last disk from the VG, the VG is also removed.
# reducevg -d informix hdisk3

varyonvg and varyoffvg commands:
--------------------------------

When you activate a VG for use, all its resident filesystems are mounted by
default if they have
the flag mount=true in the /etc/filesystems file.

# varyonvg apachevg

# varyoffvg apachevg

To use this command, you must be sure that none of the logical volumes are opened,
that is, in use.
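
A typical sequence before deactivating a volume group is to unmount its filesystems
first. A small sketch, using the apachevg example above (the /www mount point is
just an assumed name):

# lsvg -l apachevg       # check which LVs are in the VG and whether they are open
# umount /www            # unmount all filesystems belonging to the VG
# varyoffvg apachevg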

mkvg command:
-------------

You can create a new VG by using "smitty mkvg" or by using the mkvg command.

Use the following command, where "-s partition_size" sets the number of megabytes
in each physical partition; the partition_size is expressed in units of megabytes
from 1 through 1024 and must be equal to a power of 2 (for example 1, 2, 4, 8).
The default value is 4.

mkvg -y <name_of_volume_group> -s <partition_size> <list_of_hard_disks>

As with physical volumes, volume groups can be created and removed and their
characteristics
can be modified.

Before a new volume group can be added to the system, one or more physical volumes
not used
in other volume groups, and in an available state, must exist on the system.

The following example shows the use of the mkvg command to create a volume group
myvg
using the physical volumes hdisk1 and hdisk5.

# mkvg -y myvg -d 10 -s 8 hdisk1 hdisk5

# mkvg -y oravg -d 10 -s 64 hdisk1

mklv command:
-------------

To create a LV, you can use the smitty command "smitty mklv" or just use the mklv
command
by itself.

The mklv command creates a new logical volume within the VolumeGroup. Note that
every file system must reside on its own separate logical volume. The mklv command
allocates the number of logical partitions
to the new logical volume. If you specify one or more physical volumes with the
PhysicalVolume parameter,
only those physical volumes are available for allocating physical partitions;
otherwise, all the
physical volumes within the volume group are available.

The default settings provide the most commonly used characteristics, but use flags
to tailor the logical volume
to the requirements of your system. Once a logical volume is created, its
characteristics can be changed
with the chlv command.

When you create a LV, you also specify the number of LPs, and how a LP maps to PPs.
Later, you can create one filesystem per LV.

Examples

The following example creates a LV "lv05" on the VG "splvg", with two copies (2 PPs)
of each LP. In this case, we are mirroring each LP to two PPs.
Also, 200 LPs are specified. If a PP is 128 MB in size, the total amount of space
of one "mirror" is 25600 MB.

# mklv -y lv05 -c 2 splvg 200

The following example shows the use of the mklv command to create a new LV newlv in
the rootvg; it will have 10 LPs and each LP consists of 2 physical partitions.

# mklv -y newlv -c 2 rootvg 10

To make a logical volume in volume group vg02 with one logical partition and a
total of two copies of the data, enter:

# mklv -c 2 vg02 1

To make a logical volume in volume group vg03 with nine logical partitions and a
total of three copies
spread across a maximum of two physical volumes, and whose allocation policy is
not strict, enter:

# mklv -c 3 -u 2 -s n vg03 9

To make a logical volume in vg04 with five logical partitions allocated across the
center sections of the
physical volumes when possible, with no bad-block relocation, and whose type is
paging, enter:

# mklv -a c -t paging -b n vg04 5

To make a logical volume in vg03 with 15 logical partitions chosen from physical
volumes hdisk5, hdisk6, and hdisk9,
enter:
# mklv vg03 15 hdisk5 hdisk6 hdisk9

To make a striped logical volume in vg05 with a stripe size of 64K across 3
physical volumes and 12
logical partitions, enter:

# mklv -u 3 -S 64K vg05 12

To make a striped logical volume in vg05 with a stripe size of 8K across hdisk1,
hdisk2, and hdisk3 and
12 logical partitions, enter:

# mklv -S 8K vg05 12 hdisk1 hdisk2 hdisk3

The following example uses a "map file" /tmp/mymap1 which lists which PPs are to be
used in creating a LV:

# mklv -t jfs -y lv06 -m /tmp/mymap1 rootvg 10

The setting Strict=y means that each copy of the LP is placed on a different PV.
The setting Strict=n means
that copies are not restricted to different PVs.
The default is strict.

# mklv -y lv13 -c 2 failovervg 150


# crfs -v jfs -d lv13 -m /backups2 -a bf=true

Another simple example using local disks:

# mkvg -y appsvg -s 32 hdisk2


# mkvg -y datavg -s 32 hdisk3

# mklv -y testlv -c 1 appsvg 10


# mklv -y backuplv -c 1 datavg 10

# crfs -v jfs -d testlv -m /test -a bf=true


# crfs -v jfs -d backuplv -m /backup -a bf=true

mklv -y testlv1 -c 1 appsvg 10


mklv -y testlv2 -c 1 datavg 10
crfs -v jfs -d testlv1 -m /test1 -a bf=true
crfs -v jfs -d testlv2 -m /test2 -a bf=true

mklv -y testlv1 -c 1 vgp0corddap01 10


mklv -y testlv2 -c 1 vgp0corddad01 10
crfs -v jfs -d testlv1 -m /test1 -a bf=true
crfs -v jfs -d testlv2 -m /test2 -a bf=true

rmlv command:
-------------

# rmlv newlv
Warning, all data on logical volume newlv will be destroyed.
rmlv: Do you wish to continue? y(es) n(o) y
#
extendlv command:
-----------------

The following example shows the use of the extendlv command to add 3 more LPs to
the LV newlv:

# extendlv newlv 3
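
Note that extending a LV does not by itself grow a filesystem that lives on it; the
filesystem is grown separately with chfs. A minimal sketch, assuming a JFS2
filesystem mounted on /backups (an assumed example) and an AIX level (5.2/5.3) where
chfs accepts a +1G style size (on older levels the size is given in 512-byte blocks):

# extendlv lv06 8              # add 8 more LPs to the logical volume
# chfs -a size=+1G /backups    # then grow the filesystem itself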

cplv command:
-------------

The following command copies the contents of LV oldlv to a new LV called newlv:
# cplv -v myvg -y newlv oldlv

To copy to an existing LV:


# cplv -e existinglv oldlv

Purpose
Copies the contents of a logical volume to a new logical volume.

Syntax
To Copy to a New Logical Volume

cplv [ -v VolumeGroup ] [ -y NewLogicalVolume | -Y Prefix ] SourceLogicalVolume

To Copy to an Existing Logical Volume

cplv [ -f ] SourceLogicalVolume DestinationLogicalVolume

cplv -e DestinationLogicalVolume [-f] SourceLogicalVolume

-e: specifies that the DestinationLogicalVolume already exists.


-f: no user confirmation
-y: specifies the name to use for the NewLogicalVolume, instead of a system
generated name.

Description
Attention: Do not copy from a larger logical volume containing data to a smaller
one. Doing so results
in a corrupted file system because some data is not copied.
The cplv command copies the contents of SourceLogicalVolume to a new or existing
logical volume.
The SourceLogicalVolume parameter can be a logical volume name or a logical volume
ID.
The cplv command creates a new logical volume with a system-generated name by
using the default syntax.
The system-generated name is displayed.

Note:
The cplv command can not copy logical volumes which are in the open state,
including logical volumes
that are being used as backing devices for virtual storage.
Flags
-f Copies to an existing logical volume without requesting user confirmation.
-y NewLogicalVolume Specifies the name to use, in place of a system-generated name,
for the new logical volume. Logical volume names must be unique systemwide names,
and can range from 1 to 15 characters.
-Y Prefix Specifies a prefix to use in building a system-generated name for the
new logical volume. The prefix must be less than or equal to 13 characters.
A name cannot be a name already used by another device.
-v VolumeGroup Specifies the volume group where the new logical volume resides.
If this is not specified, the new logical volume resides in the same volume group
as the SourceLogicalVolume.

Examples
To copy the contents of logical volume fslv03 to a new logical volume, type:

# cplv fslv03
The new logical volume is created, placed in the same volume group as fslv03,
and named by the system.

To copy the contents of logical volume fslv03 to a new logical volume in volume
group vg02,
type:
# cplv -v vg02 fslv03
The new logical volume is created, named, and added to volume group vg02.

To copy the contents of logical volume lv02 to a smaller, existing logical volume
lvtest, without requiring user confirmation, type:

# cplv -f lv02 lvtest

Errors:
-------

0516-746 cplv: Destination logical volume must have type set to copy

Fix this by setting the type of the destination LV to "copy" first, for example:

# chlv -t copy lvprj

==========================================================================
CASES of usage of cplv command:

CASE 1:
-------

TITLE    : Procedure for moving a filesystem between disks that are in
           different volume groups using the cplv command.
OS LEVEL : AIX 4.x
DATE     : 25/11/99
VERSION  : 1.0

----------------------------------------------------------------------------

In the following example, an RS6000 has one disk with rootvg on it, and has
just had a second disk installed. The second disk needs a volume group
created on it and a data filesystem transferred to the new disk. Ensure
that you have a full system backup before you start.
lspv

hdisk0 00009922faf79f0d rootvg


hdisk1 None None

df -k

Filesystem 1024-blocks Free %Used Iused %Iused Mounted on


/dev/hd4 8192 1228 86% 1647 41% /
/dev/hd2 380928 40984 90% 11014 12% /usr
/dev/hd9var 32768 20952 37% 236 3% /var
/dev/hd3 28672 1644 95% 166 3% /tmp
/dev/hd1 53248 51284 4% 95 1% /home
/dev/lv00 200704 110324 46% 1869 4% /home/john
/dev/ftplv 102400 94528 8% 32 1% /home/ftp
/dev/lv01 114688 58240 50% 59 1% /usr2

In this example the /usr2 filesystem needs to be moved to the new disk
drive, freeing up space in the root volume group.

1, Create a data volume group on the new disk (hdisk1), the command below
will create a volume group called datavg on hdisk1 with a PP size of
32 Meg:-

mkvg -s 32 -y datavg hdisk1

2, Create a jfslog logical volume on the new volume group :-

mklv -y datalog -t jfslog datavg 1

3, Initialise the jfslog :-

logform /dev/datalog

logform: destroy /dev/datalog (y)?y

4, Umount the filesystem that is being copied :-

umount /usr2

5, Copy the /usr2 logical volume (lv01) to a new logical volume (lv11) on
the new volume group :-

cplv -y lv11 -v datavg lv01

cplv: Logical volume lv01 successfully copied to lv11 .

6, Change the /usr2 filesystem to use the new (/dev/lv11) logical volume
and not the old (/dev/lv01) logical volume :-

chfs -a dev=/dev/lv11 /usr2

7, Change the /usr2 filesystem to use the jfslog on the new volume group
(/dev/datalog) :-
chfs -a log=/dev/datalog /usr2

8, Mount the filesystem :-

mount /usr2

df -k

Filesystem 1024-blocks Free %Used Iused %Iused Mounted on


/dev/hd4 8192 1220 86% 1649 41% /
/dev/hd2 380928 40984 90% 11014 12% /usr
/dev/hd9var 32768 20952 37% 236 3% /var
/dev/hd3 28672 1644 95% 166 3% /tmp
/dev/hd1 53248 51284 4% 95 1% /home
/dev/lv00 200704 110324 46% 1869 4% /home/john
/dev/ftplv 102400 94528 8% 32 1% /home/ftp
/dev/lv11 114688 58240 50% 59 1% /usr2

9, Once the filesystem has been checked out, the old logical volume can
be removed :-

rmfs /dev/lv01

Warning, all data contained on logical volume lv01 will be destroyed.


rmlv: Do you wish to continue? y(es) n(o)? y
rmlv: Logical volume lv01 is removed.

If you wish to copy further filesystems repeat parts 4 to 9.

==========================================================================

CASE 2:
-------

Goal:
-----

A "move" of the /prj filesystem (with WebSphere in /prj/was) from rootvg
to a new (larger and better) volume group "wasvg".
The current /prj on rootvg corresponds to the LV "prjlv".
The new /prj to be created on wasvg corresponds to the LV "lvprj".

ROOTVG WASVG
-------------- --------------
|/usr (hd2) | | |
|.. | | |
|/prj (prjlv)|----------->|/prj (lvprj) |
|.. | | |
-------------- -------------
hdisk0,hdisk1 hdisk12,hdisk13

Note: /prj contains "/prj/was", which is WebSphere.

No backup tape is used here.

Use the cplv command:

umount /prj
chfs -m /prj_old /prj

+ mkvg -y wasvg -d 10 -s 128 hdisk12 hdisk13 -- create the VG

+ mklv -y lvprj -c 2 wasvg 400 -- create the LV

+ mklv -y waslog -t jfslog wasvg 1 -- create a jfslog

+ logform /dev/waslog -- initialize the log

cplv -e lvprj prjlv

chfs -a dev=/dev/lvprj /prj_old

chfs -a log=/dev/waslog /prj_old

chfs -m /prj /prj_old

mount /prj

==========================================================================

migratepv command:
------------------

Use the following command to move PPs from hdisk1 to hdisk6 and hdisk7 (all PVs
must be in 1 VG)
# migratepv hdisk1 hdisk6 hdisk7

Use the following command to move PPs in LV lv02 from hdisk1 to hdisk6
# migratepv -l lv02 hdisk1 hdisk6
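
A common use of migratepv is to empty a disk that must be replaced. A minimal
sketch, assuming hdisk1 is the old disk and hdisk8 a newly added disk in the same
volume group "datavg" (hypothetical names):

# extendvg datavg hdisk8       # add the new disk to the VG
# migratepv hdisk1 hdisk8      # move all PPs off the old disk
# reducevg datavg hdisk1       # remove the emptied disk from the VG
# rmdev -l hdisk1 -d           # delete the device definition before pulling the disk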

chvg command:
-------------

This example multiplies the maximum number of PPs per physical volume by 2
(and correspondingly halves the maximum number of PVs allowed in the VG):

# chvg -t 2 datavg
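
After changing the factor you can verify the new limits in the lsvg output
(the "MAX PPs per PV" and "MAX PVs" fields); a small check:

# lsvg datavg | grep -i "max"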

chpv command:
-------------

The chpv command changes the state of the physical volume in a volume group by
setting allocation
permission to either allow or not allow allocation and by setting the availability
to either
available or removed. This command can also be used to clear the boot record for
the given physical volume.
Characteristics for a physical volume remain in effect unless explicitly changed
with the corresponding flag.

Examples

To close physical volume hdisk03, enter:


# chpv -v r hdisk03

The physical volume is closed to logical input and output until the -v a flag is
used.

To open physical volume hdisk03, enter:


# chpv -v a hdisk03

The physical volume is now open for logical input and output.

To stop the allocation of physical partitions to physical volume hdisk03, enter:


# chpv -a n hdisk03

No physical partitions can be allocated until the -a y flag is used.

To clear the boot record of a physical volume hdisk3, enter:


# chpv -c hdisk3

How to synchronize stale partitions in a VG?:
---------------------------------------------

the syncvg command:

syncvg Command

Purpose
Synchronizes logical volume copies that are not current.

Syntax
syncvg [ -f ] [ -i ] [ -H ] [ -P NumParallelLps ] { -l | -p | -v } Name ...

Description
The syncvg command synchronizes the physical partitions, which are copies of the
original physical partition,
that are not current. The syncvg command can be used with logical volumes,
physical volumes,
or volume groups, with the Name parameter representing the logical volume name,
physical volume name,
or volume group name. The synchronization process can be time consuming, depending
on the
hardware characteristics and the amount of data.

When the -f flag is used, a good physical copy is chosen and propagated to all
other copies
of the logical partition, whether or not they are stale. Using this flag is
necessary
in cases where the logical volume does not have the mirror write consistency
recovery.

Unless disabled, the copies within a volume group are synchronized automatically
when the volume group is
activated by the varyonvg command.

Note:
For the syncvg command to be successful, at least one good copy of the logical
volume should be accessible, and the physical volumes that contain this copy
should be in an ACTIVE state.
If the -f option is used, the above condition applies to all mirror copies.
If the -P option is not specified, syncvg will check for the NUM_PARALLEL_LPS
environment variable.
The value of NUM_PARALLEL_LPS will be used to set the number of logical partitions
to be synchronized in parallel.

Examples
To synchronize the copies on physical volumes hdisk04 and hdisk05, enter:
# syncvg -p hdisk04 hdisk05

To synchronize the copies on volume groups vg04 and vg05, enter:


# syncvg -v vg04 vg05

How to Mirror a Logical Volume?:
--------------------------------

mklvcopy LogicalVolumeName NumberOfCopies
syncvg VolumeGroupName

To add a copy for LV lv01 on disk hdisk7:

# mklvcopy lv01 2 hdisk7

Identifying hotspots: lvmstat command:
--------------------------------------

The lvmstat command displays statistics gathered since the previous invocation of
lvmstat; statistics collection must first be enabled for the VG with the -e flag,
as in the first example below.
# lvmstat -v rootvg -e
# lvmstat -v rootvg -C
# lvmstat -v rootvg

Logical Volume    iocnt   KB_read   KB_wrtn   Kbps
hd8                   4         0         0   0.00
paging01              0         0         0   0.00
..
..

31.2 Mirroring a VG:
====================

LVM provides a disk mirroring facility at the LV level.
Mirroring is the association of 2 or 3 PP's with each LP in a LV.

Use the "mklv", or the "mklvcopy", or the "mirrorvg" command.

The mklv command allows you to select one or two additional copies for each
logical volume.

example:
To make a logical volume in volume group vg03 with nine logical partitions and a
total of three copies
spread across a maximum of two physical volumes, and whose allocation policy is
not strict, enter:

mklv -c 3 -u 2 -s n vg03 9

Mirroring can also be added to an existing LV using the mklvcopy command.

The mirrorvg command mirrors all the LV's on a given VG.


Examples:

- To triply mirror a VG, run


# mirrorvg -c 3 myvg

- To get default mirroring of the rootvg, run


# mirrorvg rootvg

- To replace a failed disk in a mirrored VG, run


# unmirrorvg workvg hdisk7
# reducevg workvg hdisk7
# rmdev -l hdisk7 -d

Now replace the failed disk with a new one and name it hdisk7
# extendvg workvg hdisk7
# mirrorvg workvg

mirrorvg command:
-----------------

mirrorvg Command

Purpose
Mirrors all the logical volumes that exist on a given volume group.
This command only applies to AIX 4.2.1 or later.

Syntax
mirrorvg [ -S | -s ] [ -Q ] [ -c Copies] [ -m ] VolumeGroup [ PhysicalVolume ... ]

Description
The mirrorvg command takes all the logical volumes on a given volume group and
mirrors
those logical volumes. This same functionality may also be accomplished manually
if you execute
the mklvcopy command for each individual logical volume in a volume group. As with
mklvcopy,
the target physical drives to be mirrored with data must already be members of the
volume group.
To add disks to a volume group, run the extendvg command.

By default, mirrorvg attempts to mirror the logical volumes onto any of the disks
in a volume group.
If you wish to control which drives are used for mirroring, you must include the
list of disks in the
input parameters, PhysicalVolume. Mirror strictness is enforced. Additionally,
mirrorvg mirrors
the logical volumes, using the default settings of the logical volume being
mirrored.
If you wish to violate mirror strictness or affect the policy by which the mirror
is created,
you must execute the mirroring of all logical volumes manually with the mklvcopy
command.

When mirrorvg is executed, the default behavior of the command requires that the
synchronization
of the mirrors must complete before the command returns to the user. If you wish
to avoid the delay,
use the -S or -s option. Additionally, the default value of 2 copies is always
used. To specify a value
other than 2, use the -c option.

Note: To use this command, you must either have root user authority or be a member
of the system group.

Attention: The mirrorvg command may take a significant amount of time to complete
because of complex error checking, the number of logical volumes to mirror in a
volume group, and the time it takes to synchronize the new mirrored logical volumes.
You can use the Volumes application in Web-based System Manager (wsm) to change
volume characteristics.
You could also use the System Management Interface Tool (SMIT) smit mirrorvg fast
path to run this command.

Flags

-c Copies Specifies the minimum number of copies that each logical volume must
have after
the mirrorvg command has finished executing. It may be possible, through the
independent use
of mklvcopy, that some logical volumes may have more than the minimum number
specified after
the mirrorvg command has executed. Minimum value is 2 and 3 is the maximum
value.
A value of 1 is ignored.
-m exact map Allows mirroring of logical volumes in the exact physical partition
order that
the original copy is ordered. This option requires you to specify a
PhysicalVolume(s) where the exact map
copy should be placed. If the space is insufficient for an exact mapping, then
the command will fail.
You should add new drives or pick a different set of drives that will satisfy
an exact
logical volume mapping of the entire volume group. The designated disks must be
equal to or exceed
the size of the drives which are to be exactly mirrored, regardless of if the
entire disk is used.
Also, if any logical volume to be mirrored is already mirrored, this command
will fail.
-Q Quorum Keep By default in mirrorvg, when a volume group's contents becomes
mirrored, volume group
quorum is disabled. If the user wishes to keep the volume group quorum
requirement after mirroring
is complete, this option should be used in the command. For later quorum
changes, refer to the chvg command.
-S Background Sync Returns the mirrorvg command immediately and starts a
background syncvg of the volume group.
With this option, it is not obvious when the mirrors have completely finished
their synchronization.
However, as portions of the mirrors become synchronized, they are immediately
used by the operating system
in mirror usage.
-s Disable Sync Returns the mirrorvg command immediately without performing any
type of
mirror synchronization. If this option is used, the mirror may exist for a
logical volume but
is not used by the operating system until it has been synchronized with the
syncvg command.

The following describes mirroring considerations for rootvg and non-rootvg volume groups:

- rootvg mirroring: When the rootvg mirroring has completed, you must perform
three additional tasks: bosboot, bootlist, and reboot.
The bosboot command is required to customize the bootrec of the newly mirrored
drive.
The bootlist command needs to be performed to instruct the system which disks, and
in which order, should be used for the mirrored boot process.

Finally, the default of this command is for Quorum to be turned off. For this to
take effect
on a rootvg volume group, the system must be rebooted.

- non-rootvg mirroring: When this volume group has been mirrored, the default
command causes Quorum to be deactivated. The user must close all open logical
volumes, and execute varyoffvg and then varyonvg on
the volume group for the system to understand that quorum is or is not needed for
the volume group.
If you do not revaryon the volume group, mirroring will still work correctly.
However, any quorum changes
will not have taken effect.
- rootvg and non-rootvg mirroring: The system dump devices, primary and secondary,
should not be mirrored.
In some systems, the paging device and the dump device are the same device.
However, most users want
the paging device mirrored. When mirrorvg detects that a dump device and the
paging device are the same,
the logical volume will be mirrored automatically.
If mirrorvg detects that the dump and paging device are different logical volumes,
the paging device
is automatically mirrored, but the dump logical volume is not. The dump device can
be queried and modified
with the sysdumpdev command.
Remark:
-------
Run bosboot to initialize all boot records and devices by executing the
following command:
bosboot -a -d /dev/hdisk?
hdisk? is the first hdisk listed under the PV heading after the command
lslv -l hd5 has executed.

Secondly, you need to understand that mirroring under AIX is at the logical volume
level. The mirrorvg command is a high-level command that uses the "mklvcopy"
command.
So, all LVs created before running the mirrorvg command are kept synchronised, but
if you add a new LV after running mirrorvg, you need to mirror it manually using
"mklvcopy".

Remark:
-------

lresynclv

Mirroring the rootvg:
---------------------

Method 1:
---------

How to mirror an AIX rootvg

The following steps will guide you through the mirroring of an AIX rootvg.
This info is valid for AIX 4.3.3, AIX 5.1, AIX 5.2 and AIX 5.3.

Make sure you have an empty disk, in this example it's hdisk1.
Add the disk to the vg via

# extendvg rootvg hdisk1

Mirror the vg via:

# mirrorvg -s rootvg

Now synchronize the new copies you created:

# syncvg -v rootvg

As we want to be able to boot from different disks, we need to use bosboot:

# bosboot -a

As hd5 is mirrored there is no need to do it for each disk.

Now, update the bootlist:

# bootlist -m normal hdisk1 hdisk0


# bootlist -m service hdisk1 hdisk0

Method 2:
---------

-------------------------------------------------------------------------------
# Add the new disk, say its hdisk5, to rootvg

extendvg rootvg hdisk5

# If you use one mirror disk, be sure that a quorum is not required for varyon:

chvg -Qn rootvg

# Add the mirrors for all rootvg LV's:

mklvcopy hd1 2 hdisk5


mklvcopy hd2 2 hdisk5
mklvcopy hd3 2 hdisk5
mklvcopy hd4 2 hdisk5
mklvcopy hd5 2 hdisk5
mklvcopy hd6 2 hdisk5
mklvcopy hd8 2 hdisk5
mklvcopy hd9var 2 hdisk5
mklvcopy hd10opt 2 hdisk5
mklvcopy prjlv 2 hdisk5

#If you have other LV's in your rootvg, be sure to create copies for them as
well !!
------------------------------------------------------------------------------

# lspv -l hdisk0
hd5 1 1 01..00..00..00..00 N/A
prjlv 256 256 108..44..38..50..16 /prj
hd6 59 59 00..59..00..00..00 N/A
fwdump 5 5 00..05..00..00..00 /var/adm/ras/platform
hd8 1 1 00..00..01..00..00 N/A
hd4 26 26 00..00..02..24..00 /
hd2 45 45 00..00..37..08..00 /usr
hd9var 10 10 00..00..02..08..00 /var
hd3 22 22 00..00..04..10..08 /tmp
hd1 8 8 00..00..08..00..00 /home
hd10opt 24 24 00..00..16..08..00 /opt

Method 3:
---------

In the following example, an RS6000 has 3 disks, 2 of which have the AIX
filesystems mirrored on them. The bootlist contains both hdisk0 and hdisk1.
There are no other logical volumes in rootvg other than the AIX system
logical volumes. hdisk0 has failed and needs replacing; both hdisk0 and hdisk1
are in "Hot Swap" carriers and therefore the machine does not need shutting
down.

lspv

hdisk0 00522d5f22e3b29d rootvg


hdisk1 00522d5f90e66fd2 rootvg
hdisk2 00522df586d454c3 datavg

lsvg -l rootvg

rootvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
hd6 paging 4 8 2 open/syncd N/A
hd5 boot 1 2 2 closed/syncd N/A
hd8 jfslog 1 2 2 open/syncd N/A
hd4 jfs 1 2 2 open/syncd /
hd2 jfs 12 24 2 open/syncd /usr
hd9var jfs 1 2 2 open/syncd /var
hd3 jfs 2 4 2 open/syncd /tmp
hd1 jfs 1 2 2 open/syncd /home

1, Reduce the logical volume copies from both disks to hdisk1 only :-

rmlvcopy hd6 1 hdisk0


rmlvcopy hd5 1 hdisk0
rmlvcopy hd8 1 hdisk0
rmlvcopy hd4 1 hdisk0
rmlvcopy hd2 1 hdisk0
rmlvcopy hd9var 1 hdisk0
rmlvcopy hd3 1 hdisk0
rmlvcopy hd1 1 hdisk0

2, Check that no logical volumes are left on hdisk0 :-

lspv -p hdisk0

hdisk0:
PP RANGE STATE REGION LV ID TYPE MOUNT POINT
1-101 free outer edge
102-201 free outer middle
202-301 free center
302-401 free inner middle
402-501 free inner edge

3, Remove the volume group from hdisk0

reducevg -df rootvg hdisk0

4, Recreate the boot logical volume on hdisk1, and reset bootlist:-

bosboot -a -d /dev/hdisk1
bootlist -m normal rmt0 cd0 hdisk1

5, Check that everything has been removed from hdisk0 :-


lspv

hdisk0 00522d5f22e3b29d None


hdisk1 00522d5f90e66fd2 rootvg
hdisk2 00522df586d454c3 datavg

6, Delete hdisk0 :-

rmdev -l hdisk0 -d

7, Remove the failed hard drive and replace with a new hard drive.

8, Configure the new disk drive :-

cfgmgr

9, Check new hard drive is present :-

lspv

10, Include the new hdisk in root volume group :-

extendvg rootvg hdisk? (where hdisk? is the new hard disk)

11, Re-create the mirror :-

mirrorvg rootvg hdisk? (where hdisk? is the new hard disk)

12, Synchronise the mirror :-

syncvg -v rootvg

13, Reset the bootlist :-

bootlist -m normal rmt0 cd0 hdisk0 hdisk1

14, Turn off Quorum checking on rootvg :-

chvg -Q n rootvg

Method 4:
---------

How to mirror an AIX rootvg

The following steps will guide you through the mirroring of an AIX rootvg.
This info is valid for AIX 4.3.3, AIX 5.1, AIX 5.2 and AIX 5.3.

Make sure you have an empty disk, in this example it's hdisk1.
Add the disk to the vg via "extendvg rootvg hdisk1".
Mirror the vg via: "mirrorvg rootvg".
Adapt the bootlist to add the new disk; the system will then fail over to hdisk1 if
hdisk0 fails during startup:
do bootlist -o -m normal
this will currently list 1 disk, in this example hdisk0
do bootlist -m normal hdisk0 hdisk1
Run a bosboot on both disks, this will install all software needed for boot on
the disk:
bosboot -ad hdisk0
bosboot -ad hdisk1

Method 5:
---------

Although the steps to mirror volume groups between HP and AIX are incredibly
similar,
there are enough differences to send me through hoops if/when I ever have to do
that.
Therefore, the following checklist:

1. Mirror the logical volumes:


If you don't care what disks the lvs get mirrored to, execute

mirrorvg rootvg

Otherwise:

for lv in $(lsvg -l rootvg | grep -i open/syncd | \
            grep -v dumplv | awk '{print $1}')
do
  mklvcopy ${lv} 2 ${disk}    # 2 = total number of copies of each LP
done

2. Change the quorum checking if you did not use mirrorvg:

chvg -Q n rootvg

3. Run bosboot on the new drive to copy boot files to it:

bosboot -ad ${disk}

4. Update the bootlist with the new drive:

bootlist -m normal hdisk0 hdisk1

5. Reboot the system to enable the new quorum checking parameter

Method 6:
---------

Audience: System Administrators


Date: September 25, 2002

Mirroring "rootvg" protects the operating system from a disk failure. Mirroring
"rootvg"
requires a couple extra steps compared to other volume groups. The mirrored rootvg
disk must be bootable
*and* in the bootlist. Otherwise, if the primary disk fails, you'll continue to
run,
but you won't be able to reboot.

In brief, the procedure to mirror rootvg on hdisk0 to hdisk1 is

1. Add hdisk1 to rootvg:


extendvg rootvg hdisk1

2. Mirror rootvg to hdisk1:


mirrorvg rootvg hdisk1 (or smitty mirrorvg)

3. Create boot images on hdisk1:


bosboot -ad /dev/hdisk1

4. Add hdisk1 to the bootlist:


bootlist -m normal hdisk0 hdisk1

5. Reboot to disable quorum checking on rootvg. The mirrorvg turns off quorum by
default,
but the system needs to be rebooted for it to take effect.

For more information, and a comprehensive procedure see the man page for mirrorvg
and

Example using mklvcopy:
-----------------------

mklvcopy [ -a Position ] [ -e Range ] [ -k ] [ -m MapFile ] [ -s Strict ]
         [ -u UpperBound ] LogicalVolume Copies [ PhysicalVolume... ]

Add a copy of LV "lv01" on disk hdisk7:

# mklvcopy lv01 2 hdisk7

The mklvcopy command increases the number of copies in each logical partition in
LogicalVolume.
This is accomplished by increasing the total number of physical partitions for
each logical partition
to the number represented by Copies. The LogicalVolume parameter can be a logical
volume name or
logical volume ID. You can request that the physical partitions for the new copies
be allocated
on specific physical volumes (within the volume group) with the PhysicalVolume
parameter;
otherwise, all the physical volumes within the volume group are available for
allocation.

The logical volume modified with this command uses the Copies parameter as its new
copy characteristic.
The data in the new copies are not synchronized until one of the following occurs:

the -k option is used, the volume group is activated by the varyonvg command, or
the volume group
or logical volume is synchronized explicitly by the syncvg command. Individual
logical partitions
are always updated as they are written to.

The default allocation policy is to use minimum numbering of physical volumes per
logical volume copy,
to place the physical partitions belong to a copy as contiguously as possible, and
then to place
the physical partitions in the desired region specified by the -a flag. Also, by
default, each copy
of a logical partition is placed on a separate physical volume.

Using smitty:
-------------

# smit mklv

or

# smit mklvcopy

Using "smit mklv" you can create a new LV and at the same time tell the system to
create a mirror
(2 or 3 copies) of each LP and which PV's are involved.

Using "smit mklvcopy" you can add mirrors to an existing LV.

31.3 Filesystems in AIX:
========================

After a VG is created, you can create filesystems. You can use smitty or the crfs
and mkfs command.
File systems are confined to a single logical volume.

The journaled file system (JFS) and the enhanced journaled file system (JFS2) are
built into the
base operating system. Both file system types link their file and directory data
to the structure
used by the AIX Logical Volume Manager for storage and retrieval. A difference is
that JFS2 is designed to accommodate
a 64-bit kernel and larger files.

Run lsfs -v jfs2 to determine if your system uses JFS2 file systems.
This command returns no output if it finds only standard file systems.

crfs:
-----

crfs -v VfsType { -g VolumeGroup | -d Device } [ -l LogPartitions ]
     -m MountPoint [ -n NodeName ] [ -u MountGroup ] [ -A { yes | no } ]
     [ -p { ro | rw } ] [ -a Attribute=Value ... ] [ -t { yes | no } ]

The crfs command creates a file system on a logical volume within a previously
created volume group.
A new logical volume is created for the file system unless the name of an existing
logical volume is
specified using the -d. An entry for the file system is put into the
/etc/filesystems file.

crfs -v jfs -g(vg) -m(mount poi