To start the software correctly and use it to configure cell, grid and flash disks, the maximum number of open file descriptors must be increased (otherwise a very clear error is raised).
As I found by googling, edit /etc/sysctl.conf and add/set
fs.file-max = 65536
and edit /etc/security/limits.conf and add/set
* soft nofile 65536
* hard nofile 65536
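After logging in again, the limits can be checked without root access. The following is only a minimal sketch: the helper names `limit_ok` and `report` are hypothetical, and the 65536 threshold is simply the value used above.

```shell
#!/bin/sh
# Minimum value taken from the post; the helper names below are hypothetical.
REQUIRED_NOFILE=65536

# limit_ok VALUE: succeed when VALUE is "unlimited" or a number >= REQUIRED_NOFILE.
limit_ok() {
    [ "$1" = "unlimited" ] && return 0
    [ "$1" -ge "$REQUIRED_NOFILE" ] 2>/dev/null
}

# report LABEL VALUE: print a one-line verdict for one limit.
report() {
    if limit_ok "$2"; then
        echo "$1=$2 OK"
    else
        echo "$1=$2 TOO LOW (need $REQUIRED_NOFILE)"
    fi
}

# Limits of the current shell (log in again after editing limits.conf).
report "soft nofile" "$(ulimit -S -n)"
report "hard nofile" "$(ulimit -H -n)"

# System-wide maximum, read from /proc so that root is not required.
if [ -r /proc/sys/fs/file-max ]; then
    report "fs.file-max" "$(cat /proc/sys/fs/file-max)"
fi
```

Run it as the celladmin user, since the nofile limits are per session.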
To communicate over InfiniBand, Oracle uses the RDS protocol. All the rds modules must be loaded (and configured to be loaded again after a machine restart)
[root@stocell1 ~]# modprobe rds
[root@stocell1 ~]# modprobe rds_tcp
[root@stocell1 ~]# modprobe rds_rdma
and made permanent by editing/creating rds.conf
[root@stocell1 ~]# vi /etc/modprobe.d/rds.conf
insert the line
install rds /sbin/modprobe --ignore-install rds && /sbin/modprobe rds_tcp && /sbin/modprobe rds_rdma
[EDIT: pay attention to the double hyphen “--”, which is sometimes rendered as a single “–” character]
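To confirm after a reboot that all three modules really came back, something like the following can be used. It is a sketch only: `missing_modules` is a hypothetical helper, and it reads /proc/modules (the same data `lsmod` shows) so it also works for a non-root user.

```shell
#!/bin/sh
# missing_modules NAME...: read lsmod-style text on stdin and print every
# NAME that does not appear in the first column. (Hypothetical helper.)
missing_modules() {
    loaded=$(awk '{ print $1 }')
    for m in "$@"; do
        printf '%s\n' "$loaded" | grep -qxF "$m" || echo "$m"
    done
}

# /proc/modules lists one loaded module per line, name first.
if [ -r /proc/modules ]; then
    missing_modules rds rds_tcp rds_rdma < /proc/modules
fi
```

No output means all three modules are loaded; any name printed still needs a `modprobe`.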
Now, as the celladmin user, the storage cell software can be started
[root@stocell1 ~]# su - celladmin
[celladmin@stocell1 ~]$ cellcli -e alter cell restart services all
Stopping the RS, CELLSRV, and MS services…
CELL-01509: Restart Server (RS) not responding.
Starting the RS, CELLSRV, and MS services…
Getting the state of RS services… running
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful.
CELL-01547: CELLSRV startup failed due to unknown reasons.
Starting MS services…
The STARTUP of MS services was successful.
The error is not unknown (as stated) but well known and expected
[Required IP parameters missing]
🙂
So, set the interconnect (= InfiniBand connection) and …
[celladmin@stocell1 ~]$ cellcli -e create cell stocell1 interconnect1=eth1
Cell stocell1 successfully created
Starting CELLSRV services…
The STARTUP of CELLSRV services was successful.
Flash cell disks, FlashCache, and FlashLog will be created…
CellDisk FD_00_stocell1 successfully created
CellDisk FD_01_stocell1 successfully created
CellDisk FD_02_stocell1 successfully created
CellDisk FD_03_stocell1 successfully created
CellDisk FD_04_stocell1 successfully created
CellDisk FD_05_stocell1 successfully created
Flash log stocell1_FLASHLOG successfully created
Flash cache stocell1_FLASHCACHE successfully created
(I’m not sure why Flash components are auto configured, but it can be modified later if needed)
Configure cell disks
[celladmin@stocell1 ~]$ cellcli -e create celldisk all
CellDisk CD_DISK01_stocell1 successfully created
CellDisk CD_DISK02_stocell1 successfully created
CellDisk CD_DISK03_stocell1 successfully created
CellDisk CD_DISK04_stocell1 successfully created
CellDisk CD_DISK05_stocell1 successfully created
CellDisk CD_DISK06_stocell1 successfully created
CellDisk CD_DISK07_stocell1 successfully created
CellDisk CD_DISK08_stocell1 successfully created
CellDisk CD_DISK09_stocell1 successfully created
CellDisk CD_DISK10_stocell1 successfully created
CellDisk CD_DISK11_stocell1 successfully created
CellDisk CD_DISK12_stocell1 successfully created
and grid disks
[celladmin@stocell1 ~]$ cellcli -e create griddisk all harddisk prefix=DATA
GridDisk DATA_CD_DISK01_stocell1 successfully created
GridDisk DATA_CD_DISK02_stocell1 successfully created
GridDisk DATA_CD_DISK03_stocell1 successfully created
GridDisk DATA_CD_DISK04_stocell1 successfully created
GridDisk DATA_CD_DISK05_stocell1 successfully created
GridDisk DATA_CD_DISK06_stocell1 successfully created
GridDisk DATA_CD_DISK07_stocell1 successfully created
GridDisk DATA_CD_DISK08_stocell1 successfully created
GridDisk DATA_CD_DISK09_stocell1 successfully created
GridDisk DATA_CD_DISK10_stocell1 successfully created
GridDisk DATA_CD_DISK11_stocell1 successfully created
GridDisk DATA_CD_DISK12_stocell1 successfully created
Work done!
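For a second cell, the three CellCLI calls used above can be generated from a small helper. This is just a dry-run sketch: `cell_setup_commands` is a hypothetical name, and it only prints the commands already shown in this post with the cell name, interface and prefix substituted.

```shell
#!/bin/sh
# cell_setup_commands CELL IFACE PREFIX: print, in order, the CellCLI calls
# used in this post. Nothing is executed. (Hypothetical helper.)
cell_setup_commands() {
    cell=$1
    iface=$2
    prefix=$3
    echo "cellcli -e create cell $cell interconnect1=$iface"
    echo "cellcli -e create celldisk all"
    echo "cellcli -e create griddisk all harddisk prefix=$prefix"
}

# Values from the post.
cell_setup_commands stocell1 eth1 DATA
```

Review the printed lines first, then run them one by one as celladmin so that a failure at the create cell step is noticed before the disks are touched.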
Thanks to Steven Lee. His post on the same issue gave me the solution to two problems:
– the required path /var/log/oracle, found using the method suggested by Lee
– a strange memory problem with 4GB RAM, solved simply by resizing the RAM to 2GB (in line with some of Lee’s answers)
Furthermore, the differences between my VM and his VM helped me fix the problem related to the rds modules (which have probably evolved in the meantime)
/*+ esp */
Hi,
when I created the cell, it didn’t create the flash components. Also, creating the celldisks took all the disks, which seems logical as they are similar disks. Can you tell me how your flash disks were created?
[celladmin@stocell1 ~]$ cellcli -e create cell stocell1 interconnect1=eth1
Cell stocell1 successfully created
Starting CELLSRV services…
The STARTUP of CELLSRV services was successful.
[celladmin@stocell1 ~]$ cellcli
CellCLI> create celldisk all
CellDisk CD_disk1_stocell1 successfully created
CellDisk CD_disk10_stocell1 successfully created
CellDisk CD_disk11_stocell1 successfully created
CellDisk CD_disk12_stocell1 successfully created
CellDisk CD_disk2_stocell1 successfully created
CellDisk CD_disk3_stocell1 successfully created
CellDisk CD_disk4_stocell1 successfully created
CellDisk CD_disk5_stocell1 successfully created
CellDisk CD_disk6_stocell1 successfully created
CellDisk CD_disk7_stocell1 successfully created
CellDisk CD_disk8_stocell1 successfully created
CellDisk CD_disk9_stocell1 successfully created
CellDisk CD_flash1_stocell1 successfully created
CellDisk CD_flash2_stocell1 successfully created
CellDisk CD_flash3_stocell1 successfully created
CellDisk CD_flash4_stocell1 successfully created
CellDisk CD_flash5_stocell1 successfully created
CellDisk CD_flash6_stocell1 successfully created
CellCLI>
In my env the create cell command automatically creates all the flash stuff. I cannot check, but I suppose there is a very complex algorithm to recognize flash disks from the cell/disks/raw links: FLASH in uppercase 😉 in the link name.
I’m not sure, but you can simply check that.
Drop the cell, recreate the links, recreate the cell, create the flash stuff manually if needed, and then create the celldisks
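To see in advance how the link names would be sorted under that guess, a tiny check like this can help. It is only a sketch of the assumption made above (an uppercase FLASH in the link name), not Oracle’s documented detection logic, and `classify_link` is a hypothetical name.

```shell
#!/bin/sh
# classify_link NAME: print "flash" when NAME contains the uppercase string
# FLASH, otherwise "hard". This mirrors the guess in the reply above only.
classify_link() {
    case "$1" in
        *FLASH*) echo "flash" ;;
        *)       echo "hard" ;;
    esac
}

# Link names in the style used in this thread.
for name in stocell1_DISK01 stocell1_FLASH01 flash1; do
    echo "$name -> $(classify_link "$name")"
done
```

Under this assumption a lowercase name such as `flash1` is treated as a hard disk, which matches what the commenter observed.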
Ciao
The idea crossed my mind but I thought it was too stupid to even try… but now the CAPS have worked… Hail Oracle!
Hi,
When I try to create the cell with interconnect1=eth1, the CELLSRV service cannot start up and gives this error:
CellCLI> create cell exapoc interconnect1=eth1
Cell exapoc successfully created
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful.
CELL-01547: CELLSRV startup failed due to unknown reasons.
Wed Jul 30 02:39:50 2014 787 msec State dump completed for CELLSRV
Errors in file /opt/oracle/cell11.2.3.3.1_LINUX.X64_140529.1/log/diag/asm/cell/exapoc/trace/svtrc_6248_0.trc (incident=113):
ORA-00600: internal error code, arguments: [LinuxBlockIO::init], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/cell11.2.3.3.1_LINUX.X64_140529.1/log/diag/asm/cell/exapoc/incident/incdir_113/svtrc_6248_0_i113.trc
Wed Jul 30 02:39:55 2014
State dump interrupted for Cellsrv by RS. It did not complete in 5 seconds.
[RS] monitoring process /opt/oracle/cell11.2.3.3.1_LINUX.X64_140529.1/cellsrv/bin/cellrsomt (pid: 6247) returned with error: 124
[RS] Could not start Service CELLSRV correctly. Try stopping
[RS] Stopped Service CELLSRV
Regards,
Hi,
please post some info from the trace file so we can try to understand where the problem is!
esp
I can’t stop myself from reading your blogs. I love reading your posts. Spot on with this posting, as you always are. Have they tested your theory? I seriously appreciate individuals like you. This information you are providing is awesome.
hi,
some line from the trace file.
///////////////////
2014-07-30 22:57:18.363997 :00000005: CELLSRV needs 475 hugepages, but there are only 12 available.
2014-07-30 22:57:18.364042 :00000006: CELLSRV trying to reserve 463 more hugepages.
2014-07-30 22:57:18.366826 :00000007: Successfully allocated 902MB of hugepages for buffers
LockPool name:FastFileInit::shared_pin_locks type:MUTEX POOL group:147 numLocks:256 nextLockIndex:0 totalLockRefs:0 lockArray:0x7f55dd576590
LockPool name:FastFileInit::in_mem_MD_locks type:RWLOCK POOL group:34 numLocks:256 nextLockIndex:0 totalLockRefs:0 lockArray:0x7f55dd735e60
Writing message type OSS_PIPE_ERR_FAILED_STARTUP_RESTART to OSS->RS pipe
2014-07-30 22:57:20.267087 :00000009: Master: SIGUSR2 delivered by pid – 4219, uid – 0. Dumping call stack, process map, system state, and trace buffers
Writing message type OSS_PIPE_ERR_FAILED_STARTUP_RESTART to OSS->RS pipe
skgznp_write to RSOMT failed, retval=56822
slos 0x7ffffbf891c0 Error Category: 56824
Operation: send
Location: skgznpwm2
DepInfo: Broken pipe
Error Code: 32
DDE: Flood control is not active
Incident 129 created, dump file: /opt/oracle/cell11.2.3.3.1_LINUX.X64_140529.1/log/diag/asm/cell/exapoc/incident/incdir_129/svtrc_4231_0_i129.trc
ORA-00600: internal error code, arguments: [LinuxBlockIO::init], [], [], [], [], [], [], [], [], [], [], []
Writing message type OSS_PIPE_ERR_FAILED_STARTUP_RESTART to OSS->RS pipe
skgznp_write to RSOMT failed, retval=56822
slos 0x7ffffbf95200 Error Category: 56824
Operation: send
Location: skgznpwm2
DepInfo: Broken pipe
Error Code: 32
2014-07-30 22:57:21.679204 :0000000E: CELLSRV error – ORA-600 internal error
/////////////////
ciao,
sorry, but I can’t guess anything from that info.
Please post some more info, such as the output of
uname -a
more /etc/hosts
lsmod | grep rds
and the details of the VM (memory, …)
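The same information can be gathered in one go before posting. The script below is only a convenience sketch: `section` and `collect_diag` are hypothetical helpers, and it reads /proc/modules and /proc/meminfo instead of calling `lsmod` so that it runs as any user.

```shell
#!/bin/sh
# section TITLE COMMAND...: print a header, then the command output. A
# command that fails or prints nothing is noted instead of stopping the script.
section() {
    title=$1
    shift
    echo "== $title =="
    "$@" 2>&1 || echo "(no output or command failed: $*)"
    echo
}

# collect_diag: the items asked for above, one section each.
collect_diag() {
    section "uname -a" uname -a
    section "/etc/hosts" cat /etc/hosts
    section "rds modules" grep rds /proc/modules
    section "memory" grep -E 'MemTotal|HugePages_|Hugepagesize' /proc/meminfo
}

collect_diag
```

Redirect the output to a file and paste that file into the comment.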
Hi,
[celladmin@localhost ~]$ uname -a
Linux localhost.localdomain 2.6.18-371.el5 #1 SMP Mon Sep 30 16:34:30 PDT 2013 x86_64 x86_64 x86_64 GNU/Linux
[celladmin@localhost ~]$ more /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
[celladmin@localhost ~]$ lsmod|grep rds
rds_rdma 106305 0
rds_tcp 48097 0
rds 154921 8 rds_rdma,rds_tcp
rdma_cm 73301 2 rds_rdma,ib_iser
ib_core 107841 7 rds_rdma,ib_iser,rdma_cm,ib_cm,iw_cm,ib_sa,ib_mad
VM info:
2048MB Memory
IDE Controller
IDE Secondary Master (CD/DVD):Empty
SATA Controller
SATA Port 0:stocell1.vmdk (Normal, 25.00 GB)
SATA Port 1:stocell1_DISK01.vmdk (Normal, 500.00 MB)
SATA Port 2:stocell1_DISK02.vmdk (Normal, 500.00 MB)
SATA Port 3:stocell1_DISK03.vmdk (Normal, 500.00 MB)
SATA Port 4:stocell1_DISK04.vmdk (Normal, 500.00 MB)
SATA Port 5:stocell1_DISK05.vmdk (Normal, 500.00 MB)
SATA Port 6:stocell1_DISK06.vmdk (Normal, 500.00 MB)
SATA Port 7:stocell1_DISK07.vmdk (Normal, 500.00 MB)
SATA Port 8:stocell1_DISK08.vmdk (Normal, 500.00 MB)
SATA Port 9:stocell1_DISK09.vmdk (Normal, 500.00 MB)
SATA Port 10:stocell1_DISK10.vmdk (Normal, 500.00 MB)
SATA Port 11:stocell1_DISK11.vmdk (Normal, 500.00 MB)
SATA Port 12:stocell1_DISK12.vmdk (Normal, 500.00 MB)
SATA Port 13:stocell1_FLASH01.vmdk (Normal, 400.00 MB)
SATA Port 14:stocell1_FLASH02.vmdk (Normal, 400.00 MB)
SATA Port 15:stocell1_FLASH03.vmdk (Normal, 400.00 MB)
SATA Port 16:stocell1_FLASH04.vmdk (Normal, 400.00 MB)
SATA Port 17:stocell1_FLASH05.vmdk (Normal, 400.00 MB)
SATA Port 18:stocell1_FLASH06.vmdk (Normal, 400.00 MB)
Everything seems to be set up correctly. You should debug at a deeper level or try rebuilding the cell.
First try rebuilding the disks and links, prepare the disks with dd, and verify permissions.
If you want to play/investigate, debug at a deeper level: try “strace cellcli -e create cell exapoc interconnect1=eth1”.
If that does not give you any interesting new details, note that cellcli itself is a script that starts a Java VM, which executes your command.
So run “strace” at a deeper level (directly on the Java VM + command).
Dear,
May I know how to prepare a disk with dd? Do I need to create a partition for each cell disk, such as /dev/sdb1?
You must create the disks in the VM as virtual disks and link them in a specific path; then, just to be sure, run dd if=/dev/zero of=yourcelldiskhere count=1000
Be careful not to dd your system disk. It is all in the post…
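A small guard around that dd call reduces the risk of hitting the system disk. This is a sketch under stated assumptions: `is_mounted` and `wipe_header` are hypothetical helpers, and the check only looks at /proc/mounts, so disks used through LVM or another mapping are not detected.

```shell
#!/bin/sh
# is_mounted TARGET: succeed when a device listed in /proc/mounts starts with
# TARGET (so /dev/sda also matches a mounted /dev/sda1). Mapped devices such
# as LVM volumes are NOT detected; this is only a first line of defence.
is_mounted() {
    [ -r /proc/mounts ] || return 1
    awk -v t="$1" 'index($1, t) == 1 { found = 1 } END { exit !found }' /proc/mounts
}

# wipe_header TARGET: zero the first 1000 blocks (512 bytes each) of TARGET,
# the same amount as the dd command above with its default block size.
wipe_header() {
    if is_mounted "$1"; then
        echo "refusing to write to $1: it looks mounted" >&2
        return 1
    fi
    dd if=/dev/zero of="$1" bs=512 count=1000 2>/dev/null
}
```

Call it as root with one of the cell disk devices, for example `wipe_header /dev/sdb`, after double-checking with `fdisk -l` which device is which.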
Dear all,
I am trying to set up an Exadata virtual lab according to this nice link and I got stuck at this point:
[celladmin@exqrcel01 ~]$ cellcli -e create cell exqrcel01 interconnect1=eth1
CELL-02625: Interface eth1 refers to device name .
Device name must be same as Interface name.
[celladmin@exqrcel01 ~]$
[root@exqrcel01 config]# ifconfig -a
eth0 Link encap:Ethernet HWaddr 08:00:27:F4:14:28
inet addr:192.168.2.19 Bcast:192.168.2.255 Mask:255.255.255.0 –> My public IP
……
eth1 Link encap:Ethernet HWaddr 08:00:27:E0:CA:FF —> should be my infiband
inet addr:192.168.56.6 Bcast:192.168.56.255 Mask:255.255.255.0
……
Any idea how I could move forward?
Thanks and merry Xmas
Ousseini Oumarou
First of all … Merry Christmas !!
Then if you already found the problem and the fix, please post for future help.
Otherwise you should check ifcfg* files in /etc/sysconfig/network-scripts/
because the error seems to be related to network configuration
Hello,
I was able to configure the cell server successfully, and kfod on the db server also returns the expected disks. However, the grid installation has now failed for the second time with ORA-15080.
Although I selected 3 disks (ASM_OCR, normal redundancy), I can see ORA-15063 (insufficient number of disks) in the asmca log file.
There is strange behavior on the storage server (exqrcel01) as well: CELLSRV went down and the CellCLI alert history listed an ORA-00600 error.
This is really a cumbersome task. I do not know whether this could be related to my environment settings:
OEL 6.6, CellServer software: cell-11.2.3.3.1_LINUX.X64_140708-1.x86_64.
Virtualbox running on external Hard Disk.
[celladmin@exqrcel01 ~]$ uname -na
Linux exqrcel01 2.6.32-504.el6.x86_64 #1 SMP Tue Oct 14 01:47:47 PDT 2014 x86_64 x86_64 x86_64 GNU/Linux
[celladmin@exqrcel01 ~]$
[celladmin@exqrcel01 ~]$ rpm -qa | grep -i cell
cell-11.2.3.3.1_LINUX.X64_140708-1.x86_64
[celladmin@exqrcel01 ~]$
[root@exqrdb01 Desktop]# /software/grid11204_unzip/stage/ext/bin/kfod disks=all op=disks
WARNING: Using brute force method to determine the size of /dev/raw/rawctl.
There will be performance issues. Please check configuration to determine the cause for the failure of ioctl
——————————————————————————–
Disk Size Path User Group
================================================================================
1: 1024 Mb o/192.168.56.6/DATA_CD_DISK01_exqrcel01
2: 1024 Mb o/192.168.56.6/DATA_CD_DISK02_exqrcel01
3: 1024 Mb o/192.168.56.6/DATA_CD_DISK03_exqrcel01
4: 1024 Mb o/192.168.56.6/DATA_CD_DISK04_exqrcel01
…….
[root@exqrdb01 Desktop]# /u01/app/11.2.0/grid/root.sh
Performing root user operation for Oracle 11g
The following environment variables are set as:
ORACLE_OWNER= grid
ORACLE_HOME= /u01/app/11.2.0/grid
………
Installing Trace File Analyzer
CRS-2672: Attempting to start ‘ora.cssdmonitor’ on ‘exqrdb01’
CRS-2676: Start of ‘ora.cssdmonitor’ on ‘exqrdb01’ succeeded
CRS-2672: Attempting to start ‘ora.cssd’ on ‘exqrdb01’
CRS-2672: Attempting to start ‘ora.diskmon’ on ‘exqrdb01’
CRS-2676: Start of ‘ora.diskmon’ on ‘exqrdb01’ succeeded
CRS-2676: Start of ‘ora.cssd’ on ‘exqrdb01’ succeeded
Disk Group ASM_OCR creation failed with the following message:
ORA-15018: diskgroup cannot be created
ORA-15080: synchronous I/O operation to a disk failed
Configuration of ASM … failed
see asmca logs at /u01/app/oracle/cfgtoollogs/asmca for details
……
tail -100 /u01/app/oracle/cfgtoollogs/asmca/asmca-141226AM124315.log
[main] [ 2014-12-26 00:44:10.108 CET ] [USMInstance.configureLocalASM:3041] ORA-15032: not all alterations performed
ORA-15017: diskgroup “ASM_OCR” cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup “ASM_OCR”
CellCLI> list cell detail
name: exqrcel01
….
cellsrvStatus: stopped
msStatus: running
rsStatus: running
CellCLI>
——————-
CELLCLI –> Alerthistory
CellCLI> list alerthistory
….
35 2014-12-25T23:44:38+01:00 critical “RS-7445 [Serv MS is absent] [It will be restarted] [] [] [] [] [] [] [] [] [] []”
36 2014-12-26T00:44:18+01:00 critical “ORA-00600: internal error code, arguments: [StorageIdx::getOclSIRegion], [], [], [], [], [], [], [], [], [], [], []”
CellCLI>
I had problems with rds on 6.5, and so did some of the blog’s readers. I suggest using OEL 5 with the el kernel.
Hello,
the error CELL-02625: Interface eth1 refers to device name was caused by the missing entry DEVICE=eth1 in
/etc/sysconfig/network-scripts/ifcfg-eth1.
I do not know why it was missing; the probable cause is the clone, since I cloned the cell server from an existing VirtualBox VM. After adding the entry, the configuration ran successfully.
I hope this information helps.
Regards
Perfect. Thanks for sharing.
Hi, Ousseini Oumarou
I also got this error. It is a network-related error.
Its cause lies in the network information in the /etc/sysconfig/network-scripts/ifcfg-xxx file.
The Exadata binaries read the data from this file and throw an error if they find something unusual.
This can be resolved by updating the information in the file.
1. In some cases the “NAME” field in the file needs to be changed because it differs from the device name.
2. In some cases the “DEVICE” keyword is missing from the file and needs to be added.
3. There may be a mismatch in the MAC address of the Ethernet card.
In my case the issue was resolved by adding “DEVICE=eth2” to the file, as it was not present in the OEL 6.10 network file.
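The DEVICE check can be scripted before running create cell. The helpers below are a hypothetical sketch: `ifcfg_device` and `check_ifcfg` are made-up names, and they cover only the DEVICE key, not the NAME or MAC address cases from the list.

```shell
#!/bin/sh
# ifcfg_device FILE: print the value of the DEVICE key in FILE, quotes removed.
ifcfg_device() {
    sed -n 's/^DEVICE=//p' "$1" | tr -d '"' | head -n 1
}

# check_ifcfg FILE IFACE: succeed only when FILE sets DEVICE to IFACE.
check_ifcfg() {
    dev=$(ifcfg_device "$1")
    if [ "$dev" = "$2" ]; then
        echo "$1: DEVICE=$dev OK"
    else
        echo "$1: DEVICE='$dev' does not match $2"
        return 1
    fi
}
```

For the case in this thread the call would be `check_ifcfg /etc/sysconfig/network-scripts/ifcfg-eth1 eth1`; an empty value in the message means the DEVICE line is missing altogether.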
Dear Raymond
sysctl -w fs.aio-max-nr=50000000
and also put into /etc/sysctl.conf
will solve your problem.
Hi
I am unable to create the cell because of an error connecting to MS. It complains about port 8888, but the port is listening. Any ideas or suggestions?
[celladmin@stocell1 ~]$ cellcli -e alter cell restart services all
Stopping the RS, CELLSRV, and MS services…
The SHUTDOWN of services was successful.
Starting the RS, CELLSRV, and MS services…
Getting the state of RS services… running
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful.
CELL-01547: CELLSRV startup failed due to unknown reasons.
Starting MS services…
The STARTUP of MS services was successful.
[celladmin@stocell1 ~]$ cellcli -e create cell stocell1 interconnect1=eth1
CELL-01514: Connect Error. Verify that Management Server is listening at the specified HTTP port: 8888.
[celladmin@stocell1 ~]$
Below are my environment and info from the log:
[root@stocell1 modprobe.d]# lsmod |grep rds
rds_rdma 80877 0
rdma_cm 36834 1 rds_rdma
ib_core 74355 6 rds_rdma,rdma_cm,ib_cm,iw_cm,ib_sa,ib_mad
rds_tcp 10293 0
rds 96610 2 rds_rdma,rds_tcp
[root@stocell1 modprobe.d]# netstat -anp|grep 8888
tcp 0 0 127.0.0.1:34027 127.0.0.1:8888 TIME_WAIT –
tcp 0 0 127.0.0.1:34032 127.0.0.1:8888 TIME_WAIT –
tcp 0 0 ::ffff:127.0.0.1:8888 :::* LISTEN 6540/java
[root@stocell1 modprobe.d]# netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
192.168.1.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
127.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1
0.0.0.0 192.168.1.1 0.0.0.0 UG 0 0 0 eth0
Below are from ms-odl.log
[root@stocell1 modprobe.d]#
[2014-10-26T11:26:53.828-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSCoreImpl] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] lunstat: normal changeStat: found lunname: /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/stocell1_DISK01
[2014-10-26T11:26:53.828-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSCoreImpl] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] In lunFound: LUN /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/stocell1_DISK01, os devicename: /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/stocell1_DISK01
[2014-10-26T11:26:54.007-05:00] [ossmgmt] [WARNING] [] [ms.core.MSCoreImpl] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] Tuning Block IO failed on device: /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/stocell1_DISK01
[2014-10-26T11:26:54.008-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] msosscomm ctx not valid, trying to init
[2014-10-26T11:26:54.037-05:00] [ossmgmt] [ERROR] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] Required IP parameters not configured. Err: 36
[2014-10-26T11:26:54.039-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] _ms_mprotect_corrupt_buf is set to TRUE
[2014-10-26T11:26:54.044-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSCoreImpl] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] Error while trying to sync diflist: oracle.ossmgmt.common.core.SageException: CELL-02627: There is a communication error between MS and CELLSRV. Configuration file cellinit.ora is malformed or does not include required information.[[
oracle.ossmgmt.common.core.SageException: CELL-02627: There is a communication error between MS and CELLSRV. Configuration file cellinit.ora is malformed or does not include required information.
at oracle.ossmgmt.ms.core.MSOSSComm.static_sendrecv(Native Method)
at oracle.ossmgmt.ms.core.MSOSSComm.isValidCellDisk(MSOSSComm.java:2989)
at oracle.ossmgmt.ms.core.MSCoreImpl.isValidCellDisk(MSCoreImpl.java:2043)
at oracle.ossmgmt.ms.core.MSCoreImpl.isValidSageLun(MSCoreImpl.java:2070)
at oracle.ossmgmt.ms.core.MSCoreImpl.lunFound(MSCoreImpl.java:2920)
at oracle.ossmgmt.ms.core.MSCoreImpl.getNewDiskAdpState(MSCoreImpl.java:8320)
at oracle.ossmgmt.ms.core.MSDiskPollTimerTask.run(MSDiskPollTimerTask.java:108)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)
]]
[2014-10-26T11:30:13.997-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] msosscomm ctx not valid, trying to init
[2014-10-26T11:30:13.999-05:00] [ossmgmt] [ERROR] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] Required IP parameters not configured. Err: 36
[2014-10-26T11:30:14.000-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] _ms_mprotect_corrupt_buf is set to TRUE
[2014-10-26T11:30:14.008-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] msosscomm ctx not valid, trying to init
[2014-10-26T11:30:14.011-05:00] [ossmgmt] [ERROR] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] Required IP parameters not configured. Err: 36
[2014-10-26T11:30:14.012-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] _ms_mprotect_corrupt_buf is set to TRUE
[2014-10-26T11:30:33.442-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] msosscomm ctx not valid, trying to init
[2014-10-26T11:30:33.449-05:00] [ossmgmt] [ERROR] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] Required IP parameters not configured. Err: 36
[2014-10-26T11:30:33.450-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] _ms_mprotect_corrupt_buf is set to TRUE
[root@stocell1 trace]# uname -a
Linux stocell1 2.6.32-431.el6.x86_64 #1 SMP Wed Nov 20 23:56:07 PST 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@stocell1 trace]# cat /etc/hosts
127.0.0.1 stocell1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
Below are from Alert log:
RS-7445 [Required IP parameters missing] [Check cellinit.ora] [] [] [] [] [] [] [] [] [] []
Incident details in: /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/log/diag/asm/cell/stocell1/incident/incdir_169/rstrc_6529_4_i169.trc
Sun Oct 26 12:17:47 2014
RSBK version=11.2.3.2.1,label=OSS_11.2.3.2.1_LINUX.X64_130109,Wed_Jan__9_06:09:48_PST_2013
[RS] Started Service RS_BACKUP with pid 6539
[RS] Kill previous monitoring process for core RS
Sun Oct 26 12:17:47 2014
[RS] Started monitoring process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrssmt with pid 6544
Sweep [inc][169]: completed
[RS] Required IP parameters not configured in cellinit.ora. Err: 36
OS Hugepage status:
Total/free hugepages available=12/12; hugepage size=2048KB
[RS] Start service CELLSRV failed with error: -74.
Sun Oct 26 12:17:47 2014
Could not connect to MS socket. Communication with MS may be degraded. errno=115
[RS] monitoring process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrsomt (pid: 0) returned with error: 124
[RS] Started Service MS with pid 6540
[root@stocell1 raw]# ls -l
total 0
lrwxrwxrwx 1 root root 8 Oct 25 22:26 stocell1_DISK01 -> /dev/sdb
lrwxrwxrwx 1 root root 8 Oct 25 22:27 stocell1_DISK02 -> /dev/sdc
lrwxrwxrwx 1 root root 8 Oct 25 22:27 stocell1_DISK03 -> /dev/sdd
lrwxrwxrwx 1 root root 8 Oct 25 22:27 stocell1_DISK04 -> /dev/sde
lrwxrwxrwx 1 root root 8 Oct 25 22:28 stocell1_DISK05 -> /dev/sdf
lrwxrwxrwx 1 root root 8 Oct 25 22:28 stocell1_DISK06 -> /dev/sdg
lrwxrwxrwx 1 root root 8 Oct 25 22:31 stocell1_DISK07 -> /dev/sdh
lrwxrwxrwx 1 root root 8 Oct 25 22:31 stocell1_DISK08 -> /dev/sdi
lrwxrwxrwx 1 root root 8 Oct 25 22:31 stocell1_DISK09 -> /dev/sdj
lrwxrwxrwx 1 root root 8 Oct 25 22:31 stocell1_DISK10 -> /dev/sdk
lrwxrwxrwx 1 root root 8 Oct 25 22:32 stocell1_DISK11 -> /dev/sdl
lrwxrwxrwx 1 root root 8 Oct 25 22:32 stocell1_DISK12 -> /dev/sdm
lrwxrwxrwx 1 root root 8 Oct 25 22:34 stocell1_DISK13 -> /dev/sdn
lrwxrwxrwx 1 root root 8 Oct 25 22:34 stocell1_DISK14 -> /dev/sdo
lrwxrwxrwx 1 root root 8 Oct 25 22:34 stocell1_DISK15 -> /dev/sdp
lrwxrwxrwx 1 root root 8 Oct 25 22:34 stocell1_DISK16 -> /dev/sdq
lrwxrwxrwx 1 root root 8 Oct 25 22:34 stocell1_DISK17 -> /dev/sdr
lrwxrwxrwx 1 root root 8 Oct 25 22:34 stocell1_DISK18 -> /dev/sds
[root@stocell1 raw]#
[root@stocell1 raw]# fdisk -l |grep "B,"
Disk /dev/sda: 26.8 GB, 26843545600 bytes
Disk /dev/sdb: 524 MB, 524288000 bytes
Disk /dev/sdc: 524 MB, 524288000 bytes
Disk /dev/sdd: 524 MB, 524288000 bytes
Disk /dev/sde: 524 MB, 524288000 bytes
Disk /dev/sdf: 524 MB, 524288000 bytes
Disk /dev/sdg: 524 MB, 524288000 bytes
Disk /dev/sdh: 524 MB, 524288000 bytes
Disk /dev/sdi: 524 MB, 524288000 bytes
Disk /dev/sdj: 524 MB, 524288000 bytes
Disk /dev/sdk: 524 MB, 524288000 bytes
Disk /dev/sdl: 524 MB, 524288000 bytes
Disk /dev/sdm: 524 MB, 524288000 bytes
Disk /dev/sdn: 419 MB, 419430400 bytes
Disk /dev/sdo: 419 MB, 419430400 bytes
Disk /dev/sdp: 419 MB, 419430400 bytes
Disk /dev/sdq: 524 MB, 524288000 bytes
Disk /dev/sds: 419 MB, 419430400 bytes
Disk /dev/sdr: 419 MB, 419430400 bytes
Disk /dev/mapper/vg_stocell3-lv_root: 23.6 GB, 23630708736 bytes
Disk /dev/mapper/vg_stocell3-lv_swap: 2684 MB, 2684354560 bytes
~
Hi,
Usually this kind of error is related to the host/cellinit/eth configuration. Post them or check them against the info in the blog.
You should also fix the disk names: the storage cell software automatically recognizes flash disks by a name containing the FLASH string… 🙂 it seems like a joke but…
Hi,
Thanks for the prompt reply. I will fix the disk names shortly, but I am always confused by the network. I thought there was some network issue but couldn’t figure it out. I could ping the HOST (192.168.1.5) from stocell1 (192.168.1.52), and ping back. I tried both the localhost IP 127.0.0.1 and the static IP 192.168.1.52 for the VM, but it didn’t work. Your help is greatly appreciated.
Here is the HOST ifcfg-eth0:
DEVICE=eth0
TYPE=Ethernet
UUID=315ac4fe-111f-4542-a937-dab7c0567f68
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
NAME="System eth0"
NETMASK=255.255.255.0
USERCTL=no
HWADDR=00:1F:29:DE:8B:3E
IPADDR=192.168.1.5
PREFIX=24
GATEWAY=192.168.1.1
DNS1=192.168.1.1
LAST_CONNECT=1413740127
HOST ifcfg-eth1:
DEVICE=eth1
TYPE=Ethernet
UUID=e7d3e3c9-ad13-472b-900e-1b91486c45c0
ONBOOT=no
NM_CONTROLLED=yes
BOOTPROTO=dhcp
DEFROUTE=yes
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
NAME="System eth1"
HWADDR=00:1F:29:DE:8B:42
PEERDNS=yes
PEERROUTES=yes
VM stocell1: ifcfg-eth0:
DEVICE=eth0
TYPE=Ethernet
UUID=e7fd04da-9ce8-4143-9f99-29ebc3372c71
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=dhcp
DEFROUTE=yes
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
NAME="System eth0"
HWADDR=08:00:27:38:C4:DC
PEERDNS=yes
PEERROUTES=yes
LAST_CONNECT=1414337451
VM stocell1: ifcfg-eth1
DEVICE=eth1
TYPE=Ethernet
UUID=e7c64609-0f37-4c04-ad9f-984201e6bc49
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=none
IPADDR=192.168.1.52
PREFIX=24
GATEWAY=192.168.1.1
DEFROUTE=yes
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
NAME="System eth1"
HWADDR=08:00:27:56:63:36
LAST_CONNECT=1414349747
On VM stocell1, the network is below:
Adapter1 is attached to “Bridged Adapter” and the name is “eth0”
Adapter2 is attached to “Host-only Adapter” and the name is “vboxnet0”
Hi,
I fixed the Disk name to FLASH, and fixed some errors, now the config is below, but I am still getting the same error about the HTTP port 8888.
HOST Virtualbox Setting
Host-only Networks vboxnet0 IPv4 =192.168.56.1,
IPv4 Network Mask=255.255.255.0
IPv6 = (there are some numbers, can’t remove them),
IPv6 Network Mask Length=64
eth0 Method: Manual
IPv4=192.168.1.5
Netmask=255.255.255.0
Gateway=192.168.1.1
VM stocell1 Setting
Host-only Networks vboxnet0
eth0 Method: Automatic (DHCP)
eth1 IPv4 = 192.168.56.101
Netmask=255.255.255.0
Gateway=192.168.1.1
I noticed in the Cell install log “.install_log.txt”, the installation inflated oc4jpatch to /tmp and tried to apply, but it says
apply -jdk /usr/java/jdk1.5.0_15/ -oh /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/oc4j/ms -silent /tmp/oc4jpatch/7439847 command failed: No such file or directory
The oc4jpatch actually does not exist in /tmp, wonder if it was deleted unapplied. Not sure if it matters.
And in cellinit.ora:
version=0.0
HTTP_PORT=8888
bbuChargeThreshold=800
SSL_PORT=23943
RMI_PORT=23791
bbuTempThreshold=60
DEPLOYED=TRUE
JMS_PORT=9127
BMC_SNMP_PORT=162
OK, I figured out what happened to HTTP port 8888. After the reboot, the firewall came up. Shutting down the firewall, SELinux and IPv6 fixed that issue. However, I still can’t seem to get past the cell creation part. Below is the error:
cellcli -e create cell interconnect1=eth1
CELL-02598: Ipaddress/Netmask attribute is not properly configured for interconnect eth1
Here is eth1:
Address: 192.168.56.12 (this is the fixed IP that I gave to stocell2)
Netmask: 255.255.255.0
Gateway: 192.168.1.1 (What should this gateway IP be: the router IP 192.168.1.1, or the host IP 192.168.56.1? It doesn’t matter either way, however)
Any ideas?
I’m not sure whether this is related to the OS version you chose. By the way, the simulated InfiniBand network should be … 56.xxx in your env, and both machines should have an IP on that network. Correct routing can also be configured in the route-eth1 and rule-eth1 configuration files (in the same path as the eth1 net config file). But I think that is only a performance problem if you can ping/ssh from one machine to the other
You are 100% correct that it was indeed the OS version issue. I lowered it to 5.10 and it worked smoothly. Thanks!
I’m happy that was useful. Have fun
Hi,
First I would like to say that this is a wonderful way of building a virtual environment for Oracle Exadata. I followed all the steps but got stuck with the following. I would greatly appreciate your help.
[celladmin@localhost ~]$ cellcli -e alter cell restart services all
Stopping the RS, CELLSRV, and MS services…
The SHUTDOWN of services was successful.
Starting the RS, CELLSRV, and MS services…
Getting the state of RS services… running
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful. Error: Start Timed out
Starting MS services…
The STARTUP of MS services was successful.
here is the output from alert.log file:
Cache Allocation: BufferSize: 32768. Num buffers: 5000. Start Address: 2AE48B076000
Cache Allocation: BufferSize: 65536. Num buffers: 5000. Start Address: 2AE494CB7000
Cache Allocation: BufferSize: 10485760. Num buffers: 7. Start Address: 2AE4A8538000
CELL communication is configured to use 1 interface(s):
192.168.56.102
[RS] Started Service MS with pid 5717
Sun Apr 19 16:33:47 2015
IPC version: Oracle UDP/IP (generic)
IPC Vendor 1 Protocol 2
Version 4.1
Sun Apr 19 16:34:16 2015
[RS] Process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrsomt (pid: 5714) received exception [signal num: 14] [ADDR:0x0]
Sun Apr 19 16:34:16 2015
Sun Apr 19 16:34:16 2015State dump completed for Cellsrv
Sun Apr 19 16:34:16 2015
State dump signal delivered to Cellsrv by RS.
Sun Apr 19 16:34:21 2015
State dump interrupted for Cellsrv by RS. It did not complete in 5 seconds.
Clean shutdown signal delivered to OSS
[RS] monitoring process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrsomt (pid: 0) returned with error: 124
Ciao,
please describe your environment: OS version, RAM, cell/DB version, IP settings if different from the post…
os version: same as you described in your site
Linux localhost.localdomain 2.6.18-371.el5 #1 SMP Mon Sep 30 16:34:30 PDT 2013 x86_64 x86_64 x86_64 GNU/Linux
memory info: this time I bumped it up to 4 GB
MemTotal: 4050948 kB
MemFree: 1443844 kB
Buffers: 132368 kB
Cached: 840552 kB
SwapCached: 0 kB
Active: 987032 kB
Inactive: 535028 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 4050948 kB
LowFree: 1443844 kB
SwapTotal: 4095992 kB
SwapFree: 4095992 kB
Dirty: 76 kB
Writeback: 0 kB
AnonPages: 549172 kB
Mapped: 89868 kB
Slab: 78532 kB
PageTables: 29480 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
CommitLimit: 5647352 kB
Committed_AS: 1906168 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 47568 kB
VmallocChunk: 34359690275 kB
HugePages_Total: 463
HugePages_Free: 463
HugePages_Rsvd: 451
Hugepagesize: 2048 kB
cell/db version: same as you described in your site
ip configuration: I have used a static IP for eth1, 192.168.56.50
[root@localhost ~]# ifconfig
eth0 Link encap:Ethernet HWaddr 08:00:27:14:B8:D1
inet addr:192.168.1.104 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe14:b8d1/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:22807 errors:0 dropped:0 overruns:0 frame:0
TX packets:14265 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:25594517 (24.4 MiB) TX bytes:2286503 (2.1 MiB)
eth1 Link encap:Ethernet HWaddr 08:00:27:BA:79:49
inet addr:192.168.56.50 Bcast:192.168.56.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:feba:7949/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1143 errors:0 dropped:0 overruns:0 frame:0
TX packets:872 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:133956 (130.8 KiB) TX bytes:118559 (115.7 KiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:22806 errors:0 dropped:0 overruns:0 frame:0
TX packets:22806 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:10152347 (9.6 MiB) TX bytes:10152347 (9.6 MiB)
create cell error:
[celladmin@localhost trace]$ cellcli -e create cell exacell interconnect1=eth1
Cell exacell successfully created
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful. Error: Start Timed out
restart cell error:
[celladmin@localhost trace]$ cellcli -e alter cell restart services all
Stopping the RS, CELLSRV, and MS services…
The SHUTDOWN of services was successful.
Starting the RS, CELLSRV, and MS services…
Getting the state of RS services… running
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful. Error: Start Timed out
Starting MS services…
The STARTUP of MS services was successful.
output from alert.log file:
[celladmin@localhost trace]$ tail -50 alert.log
[RS] Started Service RS_BACKUP with pid 22515
[RS] Kill previous monitoring process for core RS
Mon Apr 20 22:57:21 2015
[RS] Started monitoring process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrssmt with pid 22524
Mon Apr 20 22:57:21 2015
Successfully setting event parameter –
Mon Apr 20 22:57:21 2015
Successfully setting event parameter –
CELLSRV process id=22517
CELLSRV cell host name=localhost.localdomain
CELLSRV version=11.2.3.2.1,label=OSS_11.2.3.2.1_LINUX.X64_130109,Wed_Jan__9_06:09:48_PST_2013
OS Hugepage status:
Total/free hugepages available=451/451; hugepage size=2048KB
OS Stats: Physical memory: 3956 MB. Num cores: 1
CELLSRV configuration parameters:
version=0.0
Physical memory on machine: 3956 MB.
Memory reserved for cellsrv: 2356 MBMemory for other processes: 1600 MB.
celldisk policy config read from /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/deploy/config/cdpolicy.dat with ver no. 1 and pol no. 0
Auto Online Feature 1.3
CellServer MD5 Binary Checksum: 4701d6c7fd467f39a4f5e0dc2a4370d2
OS Hugepage status:
Total/free hugepages available=463/463; hugepage size=2048KB
MS_ALERT HUGEPAGE CLEAR
Cache Allocation: Num 1MB hugepage buffers: 900 Num 1MB non-hugepage buffers: 0
Cache Allocation: BufferSize: 512. Num buffers: 5000. Start Address: 2B83AD753000
Cache Allocation: BufferSize: 2048. Num buffers: 5000. Start Address: 2B83AD9C5000
Cache Allocation: BufferSize: 4096. Num buffers: 5000. Start Address: 2B83AE38A000
Cache Allocation: BufferSize: 8192. Num buffers: 10000. Start Address: 2B83AF713000
Cache Allocation: BufferSize: 16384. Num buffers: 5000. Start Address: 2B83B4534000
Cache Allocation: BufferSize: 32768. Num buffers: 5000. Start Address: 2B83B9355000
Cache Allocation: BufferSize: 65536. Num buffers: 5000. Start Address: 2B83C2F96000
Cache Allocation: BufferSize: 10485760. Num buffers: 7. Start Address: 2B83D6817000
CELL communication is configured to use 1 interface(s):
192.168.56.50
[RS] Started Service MS with pid 22522
Mon Apr 20 22:57:32 2015
IPC version: Oracle UDP/IP (generic)
IPC Vendor 1 Protocol 2
Version 4.1
Mon Apr 20 22:58:01 2015
[RS] Process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrsomt (pid: 22516) received exception [signal num: 14] [ADDR:0x0]
Mon Apr 20 22:58:01 2015
Mon Apr 20 22:58:01 2015State dump completed for Cellsrv
Mon Apr 20 22:58:01 2015
State dump signal delivered to Cellsrv by RS.
Mon Apr 20 22:58:06 2015
State dump interrupted for Cellsrv by RS. It did not complete in 5 seconds.
Clean shutdown signal delivered to OSS
[RS] monitoring process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrsomt (pid: 0) returned with error: 124
Sorry, I didn’t see that before.
I no longer have the simulator up and running, so I cannot check the network configuration at the moment.
But this looks strange to me:
>>CELLSRV cell host name=localhost.localdomain
Probably this resolves to 127.0.0.1 or 192.168.1.104.
You should probably configure the host name to match eth1 …
I’m not sure; I will check as soon as I can get the VMs running again.
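One way to test the guess above is to look up which address the hosts file assigns to the cell's host name. The following is only a minimal sketch: `hosts_ips_for` is a hypothetical helper (not part of the cell software), it assumes the usual "address name alias…" hosts format, and it reads the file only, without querying DNS.

```shell
#!/usr/bin/env bash
# hosts_ips_for NAME [HOSTS_FILE]
# Print every address that HOSTS_FILE (default /etc/hosts) maps to NAME,
# one per line. Hypothetical helper for illustration only.
hosts_ips_for() {
    local name=$1 file=${2:-/etc/hosts}
    awk -v name="$name" '
        $1 ~ /^#/ { next }                   # skip comment lines
        {
            sub(/#.*/, "")                   # drop trailing comments
            for (i = 2; i <= NF; i++)        # fields 2..NF are names and aliases
                if ($i == name) { print $1; break }
        }' "$file"
}

# Usage on the cell:
#   hosts_ips_for "$(hostname)"
```

If the address printed is 127.0.0.1 or the public eth0 address rather than the eth1 one, that would be consistent with the suspicion above, although the thread does not confirm that this is the only cause of the timeout.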
Hi,
Thanks for your suggestion, it worked. I was able to start up cellsrv. But now the problem is that it won’t work after I restart: if I need to restart the VM I have to reinstall, otherwise I get the same error. The firewall is disabled. Can you provide any clue?
thanks
Omar
Can you send us your network configuration?
Hosts file, eth? config files under /etc/… , output of commands like hostname, ifconfig -a, …
Do you see errors in the logs? Is your RAM enough?
Thanks a lot for putting up a wonderful page.
Quick question about the step you mentioned in your comments on creating the flash cache, where you wrote “FLASH uppercase in link name.”
Can you please explain in detail what you meant there, and how to create the flash cache? Your help is appreciated.
Mike
Hi Mike,
If I remember correctly, it was related to the file names used to simulate all cell disks, flash or not.
Cellsrv expects to find there a symbolic link to the real device (which we don’t have!). For us, that symbolic link is directly a file that will be used as the device.
We found by experiment that if you use upper-case FLASH in the file name, that file will be treated as a flash disk.
With flash disks (or something like that 😉 ) in place, the flash cache can be configured, either automatically by cellsrv or by commands.
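The naming rule described above can be sketched as a small check. This is an illustration only: `classify_sim_disk` and the file names in the usage lines are hypothetical, and the rule itself is the experimental observation from this thread, not documented behaviour.

```shell
#!/usr/bin/env bash
# classify_sim_disk FILE_NAME
# Print "flash" if the name contains upper-case FLASH, "disk" otherwise.
# The match is case-sensitive on purpose: lower-case "flash" does not count.
classify_sim_disk() {
    case $1 in
        *FLASH*) echo flash ;;
        *)       echo disk  ;;
    esac
}

# Usage with made-up names:
#   classify_sim_disk stocell1_FLASH01   # flash
#   classify_sim_disk stocell1_DISK01    # disk
```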
Thanks, it worked like a charm
I am finally able to create flashcache / flashlog using the tip that you provided.
Great.
Have fun !
Great article.
I am getting the same error that Omar reported, but I haven’t seen how Omar/you resolved this issue.
I would appreciate an update on this.
[celladmin@exacell trace]$ cellcli -e create cell exacell interconnect1=eth1
Cell exacell successfully created
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful. Error: Start Timed out
[celladmin@exacell trace]$ tail -f alert*
Tue Aug 25 12:12:50 2015
[RS] Process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrsomt (pid: 9348) received exception [signal num: 14] [ADDR:0x0]
Tue Aug 25 12:12:50 2015
Tue Aug 25 12:12:50 2015State dump completed for Cellsrv
Tue Aug 25 12:12:50 2015
State dump signal delivered to Cellsrv by RS.
Tue Aug 25 12:12:55 2015
State dump interrupted for Cellsrv by RS. It did not complete in 5 seconds.
Clean shutdown signal delivered to OSS
[RS] monitoring process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrsomt (pid: 0) returned with error: 124
[celladmin@exacell trace]$ ifconfig
eth0 Link encap:Ethernet HWaddr 08:00:27:FA:FF:E5
inet addr:192.168.56.199 Bcast:192.168.56.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:356 errors:0 dropped:1 overruns:0 frame:0
TX packets:229 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:36370 (35.5 KiB) TX bytes:32071 (31.3 KiB)
eth1 Link encap:Ethernet HWaddr 08:00:27:71:A7:9E
inet addr:192.168.56.50 Bcast:192.168.56.255 Mask:255.255.255.0
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:21 errors:0 dropped:0 overruns:0 frame:0
TX packets:11 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1304 (1.2 KiB) TX bytes:462 (462.0 b)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:58998 errors:0 dropped:0 overruns:0 frame:0
TX packets:58998 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:10518775 (10.0 MiB) TX bytes:10518775 (10.0 MiB)
[celladmin@exacell trace]$ cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.56.199 exacell.localhost.com exacell
[celladmin@exacell trace]$ hostname
exacell.localhost.com
[celladmin@exacell trace]$ cat /proc/meminfo
MemTotal: 6074532 kB
MemFree: 594384 kB
Buffers: 2672 kB
Cached: 67376 kB
SwapCached: 8220 kB
Active: 1671668 kB
Inactive: 507560 kB
Active(anon): 1644512 kB
Inactive(anon): 468640 kB
Active(file): 27156 kB
Inactive(file): 38920 kB
Unevictable: 20104 kB
Mlocked: 7840 kB
SwapTotal: 6094844 kB
SwapFree: 6060300 kB
Dirty: 336 kB
Writeback: 0 kB
AnonPages: 2121356 kB
Mapped: 37240 kB
Shmem: 1804 kB
Slab: 69792 kB
SReclaimable: 28164 kB
SUnreclaim: 41628 kB
KernelStack: 3240 kB
PageTables: 29172 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 7588940 kB
Committed_AS: 4708508 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 72944 kB
VmallocChunk: 34359654392 kB
HardwareCorrupted: 0 kB
AnonHugePages: 1351680 kB
HugePages_Total: 1507
HugePages_Free: 1507
HugePages_Rsvd: 1501
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 10176 kB
DirectMap2M: 6232064 kB
You point to eth1, with IP …50, but in your hosts file the host name is mapped to …199.
I cannot check because the simulator was created for study and then destroyed, but the issue should be configuration related.
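To make this kind of mismatch easy to spot, the address of the interconnect interface can be pulled out of the ifconfig output and compared with what the host name resolves to. A minimal sketch follows, assuming the old-style ifconfig format pasted in this thread ("inet addr:…"); `iface_ipv4` is a hypothetical helper and does not handle the newer "inet a.b.c.d netmask …" layout.

```shell
#!/usr/bin/env bash
# iface_ipv4 IFACE
# Read old-style `ifconfig` output on stdin and print the IPv4 address of IFACE.
# Hypothetical helper for illustration only.
iface_ipv4() {
    awk -v ifc="$1" '
        $1 == ifc { found = 1; next }        # header line of the wanted interface
        found && $1 == "inet" {              # its "inet addr:a.b.c.d" line
            sub(/^addr:/, "", $2); print $2; exit
        }
        /Link encap/ { found = 0 }           # header line of another interface
    '
}

# Usage on the cell:
#   ifconfig | iface_ipv4 eth1
#   getent hosts "$(hostname)"
```

If the two commands in the usage lines print different addresses, the host name and the interconnect passed to `create cell` point at different networks, which is the situation described above.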
Is it required to create griddisks for RECO asm diskgroup? Thanks
No, it isn’t!
I am facing the ORA-600 issues below while installing the database on Exadata. Could you please help me resolve them?
[celladmin@cell1 cell11.2.3.3.0_LINUX.X64_131014.1]$ cellcli -e list alerthistory
1 2015-09-12T15:21:17+05:30 critical “RS-700 [No IP found in Exadata config file] [Check cellinit.ora] [] [] [] [] [] [] [] [] [] []”
2 2015-09-12T15:32:07+05:30 critical “RS-700 [No IP found in Exadata config file] [Check cellinit.ora] [] [] [] [] [] [] [] [] [] []”
3 2015-09-12T15:43:08+05:30 critical “RS-700 [No IP found in Exadata config file] [Check cellinit.ora] [] [] [] [] [] [] [] [] [] []”
4 2015-09-12T15:51:38+05:30 critical “RS-700 [No IP found in Exadata config file] [Check cellinit.ora] [] [] [] [] [] [] [] [] [] []”
5 2015-09-12T16:02:47+05:30 critical “RS-700 [No IP found in Exadata config file] [Check cellinit.ora] [] [] [] [] [] [] [] [] [] []”
6 2015-09-12T17:06:18+05:30 critical “ORA-00600: internal error code, arguments: [StorageIdx::getOclSIRegion], [], [], [], [], [], [], [], [], [], [], []”
Uhm,
We don’t have enough info.
Memory, network conf, cellinit.ora, …
Hi
I am having a problem creating the cell and am getting the error below in alert.log. Could you please help?
OS: OEL 5.10 (Linux cell1.test 2.6.39-400.264.5.el5uek #1 )
Cell Software: cell11.2.3.3.1
[RS] Started monitoring process /opt/oracle/cell11.2.3.3.1_LINUX.X64_140708/cellsrv/bin/cellrsomt with pid 6971
Mon Dec 14 10:34:54 2015
Successfully setting event parameter –
Mon Dec 14 10:34:54 2015
Successfully setting event parameter –
CELLSRV process id=6972
CELLSRV cell host name=cell1.test
CELLSRV version=11.2.3.3.1,label=OSS_11.2.3.3.1_LINUX.X64_140708,Tue_Jul__8_04:01:56_PDT_2014
CELLSRV version md5: 32ea01399bfb4c21a7a732e1946701c3
OS Stats: Physical memory: 5838 MB. Num cores: 2
OS Hugepage status:
Total/free hugepages available=1513/1501; hugepage size=2048KB
CELLSRV configuration parameters:
version=0.0
Physical memory on machine: 5838 MB.
Memory reserved for cellsrv: 4238 MBMemory for other processes: 1600 MB.
Running on simulated hardware in production environment
celldisk policy config read from /opt/oracle/cell11.2.3.3.1_LINUX.X64_140708/cellsrv/deploy/config/cdpolicy.dat with ver no. 2 and pol no. 0
Auto Online Feature 1.6
OS Hugepage status:
Total/free hugepages available=1525/1513; hugepage size=2048KB
MS_ALERT HUGEPAGE CLEAR
Cache Allocation: Num 1MB hugepage buffers: 3000 Num 1MB non-hugepage buffers: 0
Cache Allocation: BufferSize: 512. Num buffers: 5000. Start Address: 7F317A288000
Cache Allocation: BufferSize: 2048. Num buffers: 5000. Start Address: 7F317A4FA000
Cache Allocation: BufferSize: 4096. Num buffers: 5000. Start Address: 7F317AEBF000
Cache Allocation: BufferSize: 8192. Num buffers: 10000. Start Address: 7F317C248000
Cache Allocation: BufferSize: 16384. Num buffers: 5000. Start Address: 7F3181069000
Cache Allocation: BufferSize: 32768. Num buffers: 5000. Start Address: 7F3185E8A000
Cache Allocation: BufferSize: 65536. Num buffers: 5000. Start Address: 7F318FACB000
Cache Allocation: BufferSize: 10485760. Num buffers: 7. Start Address: 7F31A334C000
CELL communication is configured to use 1 interface(s):
192.168.3.100
CELL IP affinity details:
NUMA status: non-NUMA system
cellaffinity.ora status: N/A
CELL communication will use 1 IP group(s):
Grp 0: *192.168.3.100
Mon Dec 14 10:35:04 2015
IPC version: Oracle UDP/IP (generic)
IPC Vendor 1 Protocol 2
Version 4.1
Mon Dec 14 10:35:34 2015
[RS] Process /opt/oracle/cell11.2.3.3.1_LINUX.X64_140708/cellsrv/bin/cellrsomt (pid: 6971) received exception [signal num: 14] [ADDR:0x0]
Mon Dec 14 10:35:34 2015
Mon Dec 14 10:35:34 2015 70 msec State dump completed for CELLSRV
Mon Dec 14 10:35:34 2015
State dump signal delivered to Cellsrv by RS.
Mon Dec 14 10:35:39 2015
State dump interrupted for Cellsrv by RS. It did not complete in 5 seconds.
Clean shutdown signal delivered to CELLSRV by pid – 4438, tid – 0
[RS] monitoring process /opt/oracle/cell11.2.3.3.1_LINUX.X64_140708/cellsrv/bin/cellrsomt (pid: 6971) returned with error: 124
Regards
Amit
Hi All,
First of all, thanks for creating such a great document. I was able to create the cell storage successfully up to this step.
I faced the same issue that others faced:
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful. Error: Start Timed out
The issue was that the ETH1 IP address differed from the value in /etc/hosts. So I added another entry with the ETH1 address resolving to the names stocell1.localhost.com stocell1 and restarted the network service. I still got the error after that.
Then I just changed the entry in /etc/hosts to stocell1.localhost.com stocell11, restarted the network service and the VMware machine, and retried with cellcli -e create cell stocell1 interconnect1=eth1. Boom!!!! It worked!!! After that all the commands above work and everything is up.
I am not able to understand what the real issue with the stocell1 name is.
Below is my current storage cell config (can you please say whether everything is configured correctly or something is wrong? I am confused here):
CellCLI> list cell detail
name: stocell11
bbuTempThreshold: 60
bbuChargeThreshold: 800
bmcType: absent
cellVersion: OSS_11.2.3.2.1_LINUX.X64_130109
cpuCount: 1
diagHistoryDays: 7
fanCount: 1/1
fanStatus: normal
flashCacheMode: WriteThrough
id: c9938562-c4f9-4bcd-8866-cd07323a3bf4
interconnectCount: 2
interconnect1: eth1
iormBoost: 0.0
ipaddress1: 192.168.19.129/24
kernelVersion: 2.6.32-300.10.1.el5uek
makeModel: Fake hardware
metricHistoryDays: 7
offloadEfficiency: 1.0
powerCount: 1/1
powerStatus: normal
releaseVersion: 11.2.3.2.1
releaseTrackingBug: 14522699
status: online
temperatureReading: 0.0
temperatureStatus: normal
upTime: 0 days, 1:06
cellsrvStatus: running
msStatus: running
rsStatus: running
CellCLI> list celldisk
CD_01_stocell11 normal
CD_02_stocell11 normal
CD_03_stocell11 normal
CD_04_stocell11 normal
CD_05_stocell11 normal
CD_06_stocell11 normal
CD_07_stocell11 normal
CD_09_stocell11 normal
CD_10_stocell11 normal
CD_11_stocell11 normal
CD_12_stocell11 normal
CD_13_stocell11 normal
CD_14_stocell11 normal
CD_15_stocell11 normal
CD_16_stocell11 normal
CD_17_stocell11 normal
CD_18_stocell11 normal
CD_19_stocell11 normal
CellCLI> list griddisk
DATA_CD_01_stocell11 active
DATA_CD_02_stocell11 active
DATA_CD_03_stocell11 active
DATA_CD_04_stocell11 active
DATA_CD_05_stocell11 active
DATA_CD_06_stocell11 active
DATA_CD_07_stocell11 active
DATA_CD_09_stocell11 active
DATA_CD_10_stocell11 active
DATA_CD_11_stocell11 active
DATA_CD_12_stocell11 active
DATA_CD_13_stocell11 active
Thanks
Deep
Typo above: the command issued was cellcli -e create cell stocell11 interconnect1=eth1
Hello ,
I see the error “ORA-00600: internal error code, arguments: [LinuxBlockIO::init]” in the blog. I was able to solve the error by adding the entry below to cellinit.ora:
_cellrsdef_heartbeat_timeout=10
Thanks,
Krish
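For anyone who wants to apply an entry like the one above without editing cellinit.ora by hand, here is a minimal sketch of an idempotent "set key=value" helper. `set_param` is a hypothetical function, the path in the usage line is a placeholder, and the thread does not say whether the services must be restarted afterwards.

```shell
#!/usr/bin/env bash
# set_param FILE KEY VALUE
# Set KEY=VALUE in a simple key=value file: rewrite the line if KEY is already
# present, append it otherwise. A backup copy is kept as FILE.bak.
# Hypothetical helper for illustration only.
set_param() {
    local file=$1 key=$2 value=$3
    cp -p "$file" "$file.bak" || return 1
    if grep -q "^${key}=" "$file"; then
        # KEY exists: replace the whole line, keep every other line as it is
        awk -F= -v k="$key" -v v="$value" \
            '$1 == k { print k "=" v; next } { print }' "$file.bak" > "$file"
    else
        # KEY is new: append it at the end of the file
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (placeholder path):
#   set_param /path/to/cellinit.ora _cellrsdef_heartbeat_timeout 10
```

Running it twice with the same key leaves a single line for that key, so it is safe to repeat while experimenting.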
Hi,
Thanks for the excellent post. I tried to set this up in my lab and am running into the following errors:
[celladmin@stocell1 ~]$ cellcli -e alter cell restart services all
Stopping the RS, CELLSRV, and MS services…
The SHUTDOWN of services was successful.
Starting the RS, CELLSRV, and MS services…
Getting the state of RS services… running
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful.
CELL-01531: Unable to parse the cellinit.ora file due to incorrect parameters in the file.
Starting MS services…
The STARTUP of MS services was not successful.
CELL-01531: Unable to parse the cellinit.ora file due to incorrect parameters in the file.
[celladmin@stocell1 ~]$
Note : My setup environment is as below
a) VMware Pro 12
b) Oracle Linux 7
c) Installed cell-12.1.2.3.2_LINUX.X64_160721-1.x86_64.rpm
d) Installed jdk1.8.0_66-1.8.0_66-fcs.x86_64.rpm
cellinit.ora is 0 bytes.
[celladmin@stocell1 config]$ cat /opt/oracle/cell12.1.2.3.2_LINUX.X64_160721/cellsrv/deploy/config/cellinit.ora
[celladmin@stocell1 config]$ ls -l /opt/oracle/cell12.1.2.3.2_LINUX.X64_160721/cellsrv/deploy/config/cellinit.ora
-rw-r--r--. 1 celladmin root 0 Jan 11 12:06 /opt/oracle/cell12.1.2.3.2_LINUX.X64_160721/cellsrv/deploy/config/cellinit.ora
[celladmin@stocell1 config]$ cellcli -e create cell stocell1 interconnect1=eno33554984
CELL-01514: Connect Error. Verify that Management Server is listening at the specified HTTP port: 8888.
[root@stocell1 ]# celld status
rsStatus: running
msStatus: stopped
cellsrvStatus: stopped
Please advise
I was able to proceed one step further by adding the path to the lib folder. Now I am stuck at the next command
[celladmin@stocell1 ~]$ cellcli -e alter cell restart services all
Stopping the RS, CELLSRV, and MS services…
The SHUTDOWN of services was successful.
Starting the RS, CELLSRV, and MS services…
Getting the state of RS services… running
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful.
CELL-01531: Unable to parse the cellinit.ora file due to incorrect parameters in the file.
Starting MS services…
The STARTUP of MS services was successful.
[celladmin@stocell1 ~]$ cellcli -e create cell stocell1 interconnect1=eth1
CELL-02598: Ipaddress/Netmask attribute is not properly configured for interconnect eth1.
———————————————–
[celladmin@stocell1 ~]$ cat /etc/sysconfig/network-scripts/ifcfg-eth1
TYPE=Ethernet
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=no
NAME=eth1
UUID=808d6f42-ab25-4017-b9c4-0a52dc9a42db
DEVICE=eth1
ONBOOT=yes
DNS1=192.168.116.2
IPADDR=192.168.116.161
GATEWAY=192.168.116.2
—————————————————
[celladmin@stocell1 ~]$ netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 192.168.116.2 0.0.0.0 UG 0 0 0 eth0
0.0.0.0 192.168.116.2 0.0.0.0 UG 0 0 0 eth0
0.0.0.0 192.168.116.2 0.0.0.0 UG 0 0 0 eth1
192.168.116.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
192.168.116.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1
192.168.122.0 0.0.0.0 255.255.255.0 U 0 0 0 virbr0
—————————————————
Please advise
Hi, I deleted the simulator just after the exam, so I cannot check against “my” configuration.
Try explicitly setting NETMASK=255.255.255.0 in the eth1 configuration, and also post your /etc/hosts
Ciao
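A quick way to see whether an interface file lacks settings such as the netmask suggested above is to list the keys that are not set. This is a minimal sketch: `missing_keys` is a hypothetical helper, and which keys the cell software really needs is an assumption based only on the CELL-02598 message in this thread.

```shell
#!/usr/bin/env bash
# missing_keys FILE KEY...
# Print each KEY that has no non-empty "KEY=value" line in FILE;
# print nothing if all of them are set. Hypothetical helper for illustration.
missing_keys() {
    local file=$1 key
    shift
    for key in "$@"; do
        grep -q "^${key}=." "$file" || echo "$key"
    done
}

# Usage:
#   missing_keys /etc/sysconfig/network-scripts/ifcfg-eth1 IPADDR NETMASK
```

For the ifcfg-eth1 file posted above, which has IPADDR but no NETMASK line, this would print NETMASK.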
Thanks for the suggestion.
By setting the netmask explicitly, I was able to move one more step forward, but CELLSRV still could not start.
CellCLI> alter cell restart services all
Stopping the RS, CELLSRV, and MS services…
The SHUTDOWN of services was successful.
Starting the RS, CELLSRV, and MS services…
Getting the state of RS services… running
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful.
CELL-01547: CELLSRV startup failed due to unknown reasons.
Starting MS services…
The STARTUP of MS services was successful.
CellCLI> exit
quitting
[celladmin@stocell1 ~]$ cellcli -e create cell stocell1 interconnect1=eth1
Cell stocell1 successfully created
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful.
CELL-01547: CELLSRV startup failed due to unknown reasons.
=======================================================================================
I see that cellinit.ora has now also been populated with the entry below
cat /opt/oracle/cell/cellsrv/deploy/config/cellinit.ora
#CELL Initialization Parameters
ipaddress1=192.168.116.161/24
=======================================================================================
[root@stocell1 trace]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.116.151 stocell1
192.168.116.161 stocell1-ib
=======================================================================================
[root@stocell1 trace]# netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 192.168.116.2 0.0.0.0 UG 0 0 0 eth1
0.0.0.0 192.168.116.2 0.0.0.0 UG 0 0 0 eth1
0.0.0.0 192.168.116.2 0.0.0.0 UG 0 0 0 eth0
192.168.116.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1
192.168.116.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
192.168.122.0 0.0.0.0 255.255.255.0 U 0 0 0 virbr0
=======================================================================================
[root@stocell1 trace]# ifconfig -a
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.116.151 netmask 255.255.255.0 broadcast 192.168.116.255
inet6 fe80::20c:29ff:fe07:2c3c prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:07:2c:3c txqueuelen 1000 (Ethernet)
RX packets 67346 bytes 4076442 (3.8 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 16 bytes 1128 (1.1 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.116.161 netmask 255.255.255.0 broadcast 192.168.116.255
inet6 fe80::20c:29ff:fe07:2c46 prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:07:2c:46 txqueuelen 1000 (Ethernet)
RX packets 68830 bytes 4224925 (4.0 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1802 bytes 1386484 (1.3 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
……
…..
…..
=======================================================================================
[root@stocell1 trace]# hostname
stocell1
[root@stocell1 trace]# ping 192.168.116.151
PING 192.168.116.151 (192.168.116.151) 56(84) bytes of data.
64 bytes from 192.168.116.151: icmp_seq=1 ttl=64 time=0.079 ms
64 bytes from 192.168.116.151: icmp_seq=2 ttl=64 time=0.030 ms
^C
--- 192.168.116.151 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.030/0.054/0.079/0.025 ms
[root@stocell1 trace]# ping 192.168.116.161
PING 192.168.116.161 (192.168.116.161) 56(84) bytes of data.
64 bytes from 192.168.116.161: icmp_seq=1 ttl=64 time=0.029 ms
64 bytes from 192.168.116.161: icmp_seq=2 ttl=64 time=0.053 ms
64 bytes from 192.168.116.161: icmp_seq=3 ttl=64 time=0.055 ms
^C
--- 192.168.116.161 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.029/0.045/0.055/0.014 ms
=======================================================================================
Please advise.
Latest Errors :
[celladmin@stocell1 ~]$ cellcli -e create cell stocell1 interconnect1=eth1
Cell stocell1 successfully created
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful.
CELL-01547: CELLSRV startup failed due to unknown reasons.
Alert Log
————
CELL process id=12832
CELL host name=stocell1
CELL version=12.1.2.3.2,label=OSS_12.1.2.3.2_LINUX.X64_160721,Thu_Jul_21_09:50:44_PDT_2016
CELLSRV version md5: 150f7acd0b05d50095223fb9399e36ee
OS Stats: Physical memory: 5792 MB. Num cores: 1
CELLSRV configuration parameters:
Memory reserved for cellsrv: 2892 MB Memory for other processes: 2900 MB
Running on simulated hardware in production environment
Successfully allocated 256 MB for Storage Index. Storage Index memory usage can grow up to a maximum of 289 MB.
CELL communication is configured to use 1 interface(s):
192.168.116.161
IPC version: Oracle UDP/IP (generic)
IPC Vendor 1 Protocol 2
Version 4.1
MS_ALERT HUGEPAGE CLEAR
ossmmap_map: mmap failed for SparseV2PhysMap len: 12800 as there is insufficient memory
Dumping oal memory statistics (all values in bytes)
cellsrv: total os mem: 6012474792 sga osmem: 1375731712 pga osmem: 1086888
cellsrv: sga alloc mem: 1145246520 pga alloc mem: 510120
group: total os mem: 0 ocl: 3145728
Memtype: sga: cellsrv os mem 1375731712 all group os mem 0
Memtype: pga: cellsrv os mem 1086888 all group os mem 0
Memtype: cache: cellsrv os mem 3962249216 all group os mem 0
Memtype: storidx: cellsrv os mem 289431552 all group os mem 0
Memtype: heapsummary: cellsrv os mem 18022400 all group os mem 0
Memtype: codetext: cellsrv os mem 78643200 all group os mem 0
Memtype: malloc: cellsrv os mem 33554432 all group os mem 0
Memtype: stack: cellsrv os mem 253755392 all group os mem 0
Thu Jan 12 13:12:54 2017
[RS] monitoring process /opt/oracle/cell12.1.2.3.2_LINUX.X64_160721/cellsrv/bin/cellrsomt (pid: 12830) returned with error: 161
Errors in file /opt/oracle/cell12.1.2.3.2_LINUX.X64_160721/log/diag/asm/cell/stocell1/trace/svtrc_12832_main.trc (incident=321):
ORA-00600: internal error code, arguments: [TODO(zutao): handle OOM gracefully], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/cell12.1.2.3.2_LINUX.X64_160721/log/diag/asm/cell/stocell1/incident/incdir_321/svtrc_12832_main_i321.trc
Sweep [inc][321]: completed
CELLSRV error – ORA-600 internal error
Thu Jan 12 13:12:55 2017
CELLSRV is no longer alive before state dump completes.
Thu Jan 12 13:12:55 2017
[RS] Stopped Service CELLSRV
Looks like a memory-related error. Not sure where to adjust.
Thanks, I was able to proceed by increasing the RAM of my machine. Thanks for all your help.
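Before retrying after a memory-related failure like this one, it can help to see how much RAM the VM really has in the same unit the alert log uses. A minimal sketch, assuming a standard /proc/meminfo layout; `mem_total_mb` is a hypothetical helper, and the thread does not state how much RAM was finally enough.

```shell
#!/usr/bin/env bash
# mem_total_mb [MEMINFO_FILE]
# Print MemTotal from a /proc/meminfo-style file, converted from kB to whole MB.
# Hypothetical helper for illustration only.
mem_total_mb() {
    awk '$1 == "MemTotal:" { print int($2 / 1024); exit }' "${1:-/proc/meminfo}"
}

# Usage:
#   mem_total_mb
```

The result can be compared with the "Physical memory" figure and with the sum of "Memory reserved for cellsrv" and "Memory for other processes" that CELLSRV writes to alert.log at startup.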
Hi, thanks a lot for sharing the info. I am able to create the cell storage successfully. I want to know: do I need to install the Exadata software on the DB nodes as well? I configured 2 DB nodes and 3 cell nodes; while running root.sh on node1 it failed and was not able to
Adding Clusterware entries to inittab
CRS-2672: Attempting to start ‘ora.mdnsd’ on ‘qr01db01’
CRS-2676: Start of ‘ora.mdnsd’ on ‘qr01db01’ succeeded
CRS-2672: Attempting to start ‘ora.gpnpd’ on ‘qr01db01’
CRS-2676: Start of ‘ora.gpnpd’ on ‘qr01db01’ succeeded
CRS-2672: Attempting to start ‘ora.cssdmonitor’ on ‘qr01db01’
CRS-2672: Attempting to start ‘ora.gipcd’ on ‘qr01db01’
CRS-2676: Start of ‘ora.cssdmonitor’ on ‘qr01db01’ succeeded
CRS-2676: Start of ‘ora.gipcd’ on ‘qr01db01’ succeeded
CRS-2672: Attempting to start ‘ora.cssd’ on ‘qr01db01’
CRS-2672: Attempting to start ‘ora.diskmon’ on ‘qr01db01’
CRS-2676: Start of ‘ora.diskmon’ on ‘qr01db01’ succeeded
CRS-2674: Start of ‘ora.cssd’ on ‘qr01db01’ failed
CRS-2679: Attempting to clean ‘ora.cssd’ on ‘qr01db01’
CRS-2681: Clean of ‘ora.cssd’ on ‘qr01db01’ succeeded
CRS-2673: Attempting to stop ‘ora.gipcd’ on ‘qr01db01’
CRS-2677: Stop of ‘ora.gipcd’ on ‘qr01db01’ succeeded
CRS-2673: Attempting to stop ‘ora.cssdmonitor’ on ‘qr01db01’
CRS-2677: Stop of ‘ora.cssdmonitor’ on ‘qr01db01’ succeeded
CRS-2673: Attempting to stop ‘ora.gpnpd’ on ‘qr01db01’
CRS-2677: Stop of ‘ora.gpnpd’ on ‘qr01db01’ succeeded
CRS-2673: Attempting to stop ‘ora.mdnsd’ on ‘qr01db01’
CRS-2677: Stop of ‘ora.mdnsd’ on ‘qr01db01’ succeeded
CRS-4000: Command Start failed, or completed with errors.
CSS startup failed with return code 1
The exlusive mode cluster start failed, see Grid Infrastructure alert log for more information
Initial cluster configuration failed. See /oraeng/GI/cfgtoollogs/crsconfig/rootcrs_qr01db01.log for details
/oraeng/GI/perl/bin/perl -I/oraeng/GI/perl/lib -I/oraeng/GI/crs/install /oraeng/GI/crs/install/rootcrs.pl execution failed
[root@qr01db01 ~]# ifconfig
The ocssd log file shows this error:
71: [ CSSD][1080420672]clssgmDeadProc: proc 0x2a81e50
2017-10-07 09:53:52.185: [ CSSD][1080420672]clssgmDestroyProc: cleaning up proc(0x2a81e50) con(0x1b0) skgpid ospid 10755 with 0 clients, refcount 0
2017-10-07 09:53:52.186: [ CSSD][1080420672]clssgmDiscEndpcl: gipcDestroy 0x1b0
2017-10-07 09:53:52.226: [ CSSD][1080420672]clssscSelect: cookie accept request 0x26a2c10
2017-10-07 09:53:52.226: [ CSSD][1080420672]clssgmAllocProc: (0x2a81ef0) allocated
2017-10-07 09:53:52.226: [ CSSD][1080420672]clssgmClientConnectMsg: properties of cmProc 0x2a81ef0 – 1,2,3,4,5
2017-10-07 09:53:52.226: [ CSSD][1080420672]clssgmClientConnectMsg: Connect from con(0x200) proc(0x2a81ef0) pid(10755) version 11:2:1:4, properties: 1,2,3,4,5
2017-10-07 09:53:52.226: [ CSSD][1080420672]clssgmClientConnectMsg: msg flags 0x0000
2017-10-07 09:53:54.257: [ SKGFD][1099741504]ERROR: -8(OS Error 1 (bind_fail,skgxpvifconf,requested interface 10.0.0.50 failed bind. Check output from ifconfig command,Error 0)
)
2017-10-07 09:53:54.257: [ SKGFD][1099741504]ERROR: -10(OSS Operation oss_initialize failed with error 4 [Network initialization failed]
)
2017-10-07 09:53:54.258: [ CSSD][1099741504]clsssnmvDDiscThread: Unable to create clsf context
2017-10-07 09:53:54.258: [ CSSD][1099741504]###################################
2017-10-07 09:53:54.258: [ CSSD][1099741504]clssscExit: CSSD aborting from thread clssnmvDDiscThread
2017-10-07 09:53:54.258: [ CSSD][1099741504]###################################
2017-10-07 09:53:54.258: [ CSSD][1099741504](:CSSSC00012:)clssscExit: A fatal error occurred and the CSS daemon is terminating abnormally
2017-10-07 09:53:54.258: [ CSSD][1099741504]
I checked, though: the eth interface is up and running and I am able to ping from both nodes.
Please help me with this.
Thanks a lot for this post….