Start and use storage cell software to configure cell disks, flash disks, grid disks

Posted on 11/12/2013 by dbaesp

To start the software correctly and use it to configure cell/grid/flash disks, the number of open file descriptors must be increased (otherwise a very clear error is raised).

As I found by googling, edit /etc/sysctl.conf and add/set

fs.file-max = 65536

and edit /etc/security/limits.conf and add/set

* soft nofile 65536
* hard nofile 65536
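
A quick way to confirm that the new limits are in effect is sketched below. It is only an illustration: the helper name meets_minimum is made up here, 65536 is taken from the settings above, and a Linux /proc filesystem is assumed. Kernel settings can be re-read with sysctl -p, while the nofile limits apply only to new login sessions.

```shell
#!/bin/sh
# Minimum taken from the values set above.
REQUIRED=65536

# meets_minimum NAME VALUE (hypothetical helper):
# prints OK or LOW and returns 1 when VALUE is below REQUIRED.
meets_minimum() {
    if [ "$2" = "unlimited" ]; then
        echo "OK   $1=$2"
    elif [ "$2" -ge "$REQUIRED" ] 2>/dev/null; then
        echo "OK   $1=$2"
    else
        echo "LOW  $1=$2 (need >= $REQUIRED)"
        return 1
    fi
}

# Current kernel-wide and per-session values; "|| true" keeps the script going.
meets_minimum fs.file-max "$(cat /proc/sys/fs/file-max 2>/dev/null || echo 0)" || true
meets_minimum "soft nofile" "$(ulimit -Sn)" || true
meets_minimum "hard nofile" "$(ulimit -Hn)" || true
```

Run it as celladmin after logging in again, since that is the user that starts the cell services.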

To communicate over InfiniBand, Oracle uses the RDS protocol. All the rds modules must be loaded (and configured to be loaded again after a machine restart)

[root@stocell1 ~]# modprobe rds
[root@stocell1 ~]# modprobe rds_tcp
[root@stocell1 ~]# modprobe rds_rdma

and, to make this permanent, create or edit rds.conf

[root@stocell1 ~]# vi /etc/modprobe.d/rds.conf

insert the line

install rds /sbin/modprobe --ignore-install rds && /sbin/modprobe rds_tcp && /sbin/modprobe rds_rdma

[EDIT: pay attention to the double hyphen “--” before ignore-install; when copied from a web page it sometimes becomes a single dash character “–”]
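
To verify that all three modules are really loaded (now, and again after a reboot), the lsmod output can be checked with a small filter. This is a sketch only; the function name missing_rds_modules is invented for this example.

```shell
# missing_rds_modules (hypothetical helper): read lsmod-style output on stdin
# and print every required rds module that does not appear in it.
missing_rds_modules() {
    loaded=$(awk '{ print $1 }')
    for m in rds rds_tcp rds_rdma; do
        echo "$loaded" | grep -qx "$m" || echo "$m"
    done
}

# On a live system:
#   lsmod | missing_rds_modules
```

No output means that nothing is missing.
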
Now the storage cell software can be started as the celladmin user

[root@stocell1 ~]# su - celladmin
[celladmin@stocell1 ~]$ cellcli -e alter cell restart services all

Stopping the RS, CELLSRV, and MS services…
CELL-01509: Restart Server (RS) not responding.
Starting the RS, CELLSRV, and MS services…
Getting the state of RS services… running
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful.
CELL-01547: CELLSRV startup failed due to unknown reasons.
Starting MS services…
The STARTUP of MS services was successful.

The error is not unknown (as stated) but well known and expected

[Required IP parameters missing]

🙂
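
To confirm the real cause, the message can be searched for in the cell alert log. A minimal sketch: ip_param_errors is a name invented here, and the log file must be passed explicitly because its location depends on the installed cell version (the alert-log excerpt posted in the comments shows the RS-7445 form of the message).

```shell
# ip_param_errors FILE (hypothetical helper): print the numbered lines of FILE
# that mention the missing IP parameters; returns 1 when there are none.
ip_param_errors() {
    grep -n "Required IP parameters" "$1"
}
```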

So, set the interconnect (= InfiniBand connection) and …

[celladmin@stocell1 ~]$ cellcli -e create cell stocell1 interconnect1=eth1
Cell stocell1 successfully created
Starting CELLSRV services…
The STARTUP of CELLSRV services was successful.
Flash cell disks, FlashCache, and FlashLog will be created…
CellDisk FD_00_stocell1 successfully created
CellDisk FD_01_stocell1 successfully created
CellDisk FD_02_stocell1 successfully created
CellDisk FD_03_stocell1 successfully created
CellDisk FD_04_stocell1 successfully created
CellDisk FD_05_stocell1 successfully created
Flash log stocell1_FLASHLOG successfully created
Flash cache stocell1_FLASHCACHE successfully created

(I’m not sure why the flash components are configured automatically, but this can be changed later if needed)
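
Judging from the discussion in the comments, the cell software in this test setup seems to treat a link under disks/raw as a flash disk when its name contains FLASH in uppercase. That is an observation from this environment, not documented behavior. Under that assumption, the following sketch (the function name is invented) shows how a set of link names would be classified before running create cell.

```shell
# classify_raw_links (hypothetical helper): read link names on stdin and label
# each one the way the cell software appears to treat it.
classify_raw_links() {
    while IFS= read -r name; do
        case "$name" in
            *FLASH*) echo "flash    $name" ;;
            *)       echo "harddisk $name" ;;
        esac
    done
}

# Example, run from inside the disks/raw directory of the cell installation:
#   ls | classify_raw_links
```
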
Configure cell disks

[celladmin@stocell1 ~]$ cellcli -e create celldisk all
CellDisk CD_DISK01_stocell1 successfully created
CellDisk CD_DISK02_stocell1 successfully created
CellDisk CD_DISK03_stocell1 successfully created
CellDisk CD_DISK04_stocell1 successfully created
CellDisk CD_DISK05_stocell1 successfully created
CellDisk CD_DISK06_stocell1 successfully created
CellDisk CD_DISK07_stocell1 successfully created
CellDisk CD_DISK08_stocell1 successfully created
CellDisk CD_DISK09_stocell1 successfully created
CellDisk CD_DISK10_stocell1 successfully created
CellDisk CD_DISK11_stocell1 successfully created
CellDisk CD_DISK12_stocell1 successfully created

and grid disks

[celladmin@stocell1 ~]$ cellcli -e create griddisk all harddisk prefix=DATA
GridDisk DATA_CD_DISK01_stocell1 successfully created
GridDisk DATA_CD_DISK02_stocell1 successfully created
GridDisk DATA_CD_DISK03_stocell1 successfully created
GridDisk DATA_CD_DISK04_stocell1 successfully created
GridDisk DATA_CD_DISK05_stocell1 successfully created
GridDisk DATA_CD_DISK06_stocell1 successfully created
GridDisk DATA_CD_DISK07_stocell1 successfully created
GridDisk DATA_CD_DISK08_stocell1 successfully created
GridDisk DATA_CD_DISK09_stocell1 successfully created
GridDisk DATA_CD_DISK10_stocell1 successfully created
GridDisk DATA_CD_DISK11_stocell1 successfully created
GridDisk DATA_CD_DISK12_stocell1 successfully created
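
Since each command prints one line per created object, a simple count catches a disk that was silently skipped. A sketch with an invented helper name; the expected totals (12 hard disks, 6 flash disks) are those of this VM only.

```shell
# count_created KIND (hypothetical helper): count the "successfully created"
# lines of the given kind (CellDisk, GridDisk) in cellcli output on stdin.
count_created() {
    grep -c "^$1 .* successfully created" || true
}

# Example:
#   cellcli -e create griddisk all harddisk prefix=DATA | count_created GridDisk
```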

Work done!

Thanks to Steven Lee. His post on the same issue gave me the solution to two problems:
- the required path /var/log/oracle, found using the method suggested by Lee
- a strange memory problem with 4GB RAM, solved simply by resizing the RAM to 2GB (according to some of Lee's answers)
Furthermore, the differences between my VM and his VM helped me fix the problem related to the rds modules (which have probably evolved in the meantime)

/*+ esp */

60 thoughts on “Start and use storage cell software to configure cell disks, flash disks, grid disks”

Vinay Singh says:

26/07/2014 at 13:56

Hi,
When I created the cell, it didn’t create the flash components. Also, creating the celldisks took all the disks, which seems logical as they are similar disks. Can you tell me how your flash disks were created?

[celladmin@stocell1 ~]$ cellcli -e create cell stocell1 interconnect1=eth1
Cell stocell1 successfully created
Starting CELLSRV services…
The STARTUP of CELLSRV services was successful.
[celladmin@stocell1 ~]$ cellcli
CellCLI> create celldisk all
CellDisk CD_disk1_stocell1 successfully created
CellDisk CD_disk10_stocell1 successfully created
CellDisk CD_disk11_stocell1 successfully created
CellDisk CD_disk12_stocell1 successfully created
CellDisk CD_disk2_stocell1 successfully created
CellDisk CD_disk3_stocell1 successfully created
CellDisk CD_disk4_stocell1 successfully created
CellDisk CD_disk5_stocell1 successfully created
CellDisk CD_disk6_stocell1 successfully created
CellDisk CD_disk7_stocell1 successfully created
CellDisk CD_disk8_stocell1 successfully created
CellDisk CD_disk9_stocell1 successfully created
CellDisk CD_flash1_stocell1 successfully created
CellDisk CD_flash2_stocell1 successfully created
CellDisk CD_flash3_stocell1 successfully created
CellDisk CD_flash4_stocell1 successfully created
CellDisk CD_flash5_stocell1 successfully created
CellDisk CD_flash6_stocell1 successfully created

CellCLI>

- dbaesp says:
  
  26/07/2014 at 14:48
  
  In my env the create cell command automatically creates all the flash stuff. I cannot check, but I suppose there is a very complex algorithm to recognize flash disks from the cell/disks/raw links: FLASH in uppercase 😉 in the link name.
  I’m not sure, but you can simply check that.
  Drop the cell, recreate the links, recreate the cell, create the flash stuff manually if needed, and then create the celldisks
  
  Ciao
  
Vinay Singh says:

27/07/2014 at 07:21

The idea crossed my mind, but I thought it was too stupid to even try… but now the CAPS have worked… Hail Oracle!

Raymond says:

30/07/2014 at 10:48

Hi,

When I try to create the cell with interconnect1=eth1, the CELLSRV service fails to start with this error:

CellCLI> create cell exapoc interconnect1=eth1
Cell exapoc successfully created
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful.
CELL-01547: CELLSRV startup failed due to unknown reasons.

Wed Jul 30 02:39:50 2014 787 msec State dump completed for CELLSRV
Errors in file /opt/oracle/cell11.2.3.3.1_LINUX.X64_140529.1/log/diag/asm/cell/exapoc/trace/svtrc_6248_0.trc (incident=113):
ORA-00600: internal error code, arguments: [LinuxBlockIO::init], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /opt/oracle/cell11.2.3.3.1_LINUX.X64_140529.1/log/diag/asm/cell/exapoc/incident/incdir_113/svtrc_6248_0_i113.trc
Wed Jul 30 02:39:55 2014
State dump interrupted for Cellsrv by RS. It did not complete in 5 seconds.
[RS] monitoring process /opt/oracle/cell11.2.3.3.1_LINUX.X64_140529.1/cellsrv/bin/cellrsomt (pid: 6247) returned with error: 124
[RS] Could not start Service CELLSRV correctly. Try stopping
[RS] Stopped Service CELLSRV

Regards,

- dbaesp says:
  
  30/07/2014 at 19:21
  
  Hi,
  please post some info from the trace file so we can try to understand where the problem is!
  
  esp
  
Raymond says:

31/07/2014 at 07:49

hi,

Some lines from the trace file:

///////////////////
2014-07-30 22:57:18.363997 :00000005: CELLSRV needs 475 hugepages, but there are only 12 available.
2014-07-30 22:57:18.364042 :00000006: CELLSRV trying to reserve 463 more hugepages.
2014-07-30 22:57:18.366826 :00000007: Successfully allocated 902MB of hugepages for buffers
LockPool name:FastFileInit::shared_pin_locks type:MUTEX POOL group:147 numLocks:256 nextLockIndex:0 totalLockRefs:0 lockArray:0x7f55dd576590
LockPool name:FastFileInit::in_mem_MD_locks type:RWLOCK POOL group:34 numLocks:256 nextLockIndex:0 totalLockRefs:0 lockArray:0x7f55dd735e60
Writing message type OSS_PIPE_ERR_FAILED_STARTUP_RESTART to OSS->RS pipe
2014-07-30 22:57:20.267087 :00000009: Master: SIGUSR2 delivered by pid – 4219, uid – 0. Dumping call stack, process map, system state, and trace buffers
Writing message type OSS_PIPE_ERR_FAILED_STARTUP_RESTART to OSS->RS pipe
skgznp_write to RSOMT failed, retval=56822
slos 0x7ffffbf891c0 Error Category: 56824
Operation: send
Location: skgznpwm2
DepInfo: Broken pipe
Error Code: 32
DDE: Flood control is not active
Incident 129 created, dump file: /opt/oracle/cell11.2.3.3.1_LINUX.X64_140529.1/log/diag/asm/cell/exapoc/incident/incdir_129/svtrc_4231_0_i129.trc
ORA-00600: internal error code, arguments: [LinuxBlockIO::init], [], [], [], [], [], [], [], [], [], [], []

Writing message type OSS_PIPE_ERR_FAILED_STARTUP_RESTART to OSS->RS pipe
skgznp_write to RSOMT failed, retval=56822
slos 0x7ffffbf95200 Error Category: 56824
Operation: send
Location: skgznpwm2
DepInfo: Broken pipe
Error Code: 32
2014-07-30 22:57:21.679204 :0000000E: CELLSRV error - ORA-600 internal error
/////////////////

- dbaesp says:
  
  31/07/2014 at 10:47
  
  ciao,
  sorry, but I cannot guess anything from that info.
  Please post some more info, such as the output of
  uname -a
  more /etc/hosts
  lsmod | grep rds
  and the details of the VM (memory, …)
  
  - Raymond says:
    
    05/08/2014 at 03:38
    
    Hi,
    
    [celladmin@localhost ~]$ uname -a
    Linux localhost.localdomain 2.6.18-371.el5 #1 SMP Mon Sep 30 16:34:30 PDT 2013 x86_64 x86_64 x86_64 GNU/Linux
    [celladmin@localhost ~]$ more /etc/hosts
    # Do not remove the following line, or various programs
    # that require network functionality will fail.
    127.0.0.1 localhost.localdomain localhost
    ::1 localhost6.localdomain6 localhost6
    [celladmin@localhost ~]$ lsmod|grep rds
    rds_rdma 106305 0
    rds_tcp 48097 0
    rds 154921 8 rds_rdma,rds_tcp
    rdma_cm 73301 2 rds_rdma,ib_iser
    ib_core 107841 7 rds_rdma,ib_iser,rdma_cm,ib_cm,iw_cm,ib_sa,ib_mad
    
    VM info:
    2048MB Memory
    
    IDE Controller
    
    IDE Secondary Master (CD/DVD):Empty
    SATA Controller
    SATA Port 0:stocell1.vmdk (Normal, 25.00 GB)
    SATA Port 1:stocell1_DISK01.vmdk (Normal, 500.00 MB)
    SATA Port 2:stocell1_DISK02.vmdk (Normal, 500.00 MB)
    SATA Port 3:stocell1_DISK03.vmdk (Normal, 500.00 MB)
    SATA Port 4:stocell1_DISK04.vmdk (Normal, 500.00 MB)
    SATA Port 5:stocell1_DISK05.vmdk (Normal, 500.00 MB)
    SATA Port 6:stocell1_DISK06.vmdk (Normal, 500.00 MB)
    SATA Port 7:stocell1_DISK07.vmdk (Normal, 500.00 MB)
    SATA Port 8:stocell1_DISK08.vmdk (Normal, 500.00 MB)
    SATA Port 9:stocell1_DISK09.vmdk (Normal, 500.00 MB)
    SATA Port 10:stocell1_DISK10.vmdk (Normal, 500.00 MB)
    SATA Port 11:stocell1_DISK11.vmdk (Normal, 500.00 MB)
    SATA Port 12:stocell1_DISK12.vmdk (Normal, 500.00 MB)
    SATA Port 13:stocell1_FLASH01.vmdk (Normal, 400.00 MB)
    SATA Port 14:stocell1_FLASH02.vmdk (Normal, 400.00 MB)
    SATA Port 15:stocell1_FLASH03.vmdk (Normal, 400.00 MB)
    SATA Port 16:stocell1_FLASH04.vmdk (Normal, 400.00 MB)
    SATA Port 17:stocell1_FLASH05.vmdk (Normal, 400.00 MB)
    SATA Port 18:stocell1_FLASH06.vmdk (Normal, 400.00 MB)
dbaesp says:

07/08/2014 at 14:19

All seems to be well done. You should debug at a deeper level or try rebuilding the cell.
First try rebuilding the disks and links, prepare the disks with dd, and verify the permissions.
If you want to play/investigate, debug at a deeper level: try “strace cellcli -e create cell exapoc interconnect1=eth1”.
If that gives no new interesting details, note that cellcli itself is a script that runs a Java VM, which executes your command.
So run “strace” at a deeper level (directly on the Java VM + command).
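
A strace log gets long quickly, so it helps to reduce it to the system calls that failed. The sketch below is one way to do that; the function name and the trace file path are invented for illustration, and it relies only on the usual "= -1 ERRNO" suffix that strace prints for failed calls.

```shell
# summarize_strace_errors FILE (hypothetical helper):
# count failed system calls in a strace log, grouped by errno name.
summarize_strace_errors() {
    grep -E '= -1 E[A-Z]+' "$1" |
        sed -E 's/.*= -1 (E[A-Z]+).*/\1/' |
        sort | uniq -c | sort -rn
}

# Capture a trace first (-f follows child processes, -o writes to a file):
#   strace -f -o /tmp/cellcli.strace cellcli -e create cell exapoc interconnect1=eth1
#   summarize_strace_errors /tmp/cellcli.strace
```

Many ENOENT entries are normal (library and config lookups); the less common error names are usually the ones worth reading in full.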

- Raymond says:
  
  08/08/2014 at 10:42
  
  Dear,
  
  May I know how to prepare a disk with dd? Do I need to create a partition for each cell disk, such as /dev/sdb1?
  
  - dbaesp says:
    
    08/08/2014 at 10:59
    
    You must create the disks in the VM as virtual disks and link them in a specific path. Just to be sure, do a dd if=/dev/zero of=yourcelldiskhere count=1000
    Pay attention to avoid dd-ing your system disk. All is in the post…
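
Because a wrong of= target destroys data, the dd step can be wrapped in a small guard. This is a sketch under assumptions: zero_head is an invented name, the mount check is only a partial safeguard (it does not detect a disk used through LVM, for example), and bs=512 simply spells out the dd default so that 1000 blocks match the command above.

```shell
# zero_head TARGET [BLOCKS] (hypothetical helper): overwrite the first BLOCKS
# 512-byte blocks of TARGET with zeros, unless TARGET looks mounted.
zero_head() {
    target=$(readlink -f "$1") || return 2   # follow the disks/raw symlink
    blocks=${2:-1000}
    # Refuse when the device, or a partition of it, is listed in /proc/mounts.
    if awk -v t="$target" 'index($1, t) == 1 { found = 1 } END { exit !found }' /proc/mounts 2>/dev/null; then
        echo "refusing: $target or one of its partitions is mounted" >&2
        return 1
    fi
    # conv=notrunc only matters when TARGET is a regular file (e.g. in a test).
    dd if=/dev/zero of="$target" bs=512 count="$blocks" conv=notrunc 2>/dev/null
}

# Example (as root): zero_head /path/to/disks/raw/stocell1_DISK01
```
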
- Ousseini Oumarou, Senior Oracle DBA says:
  
  24/12/2014 at 17:04
  
  Dear all,
  
  I am trying to set up an Exadata virtual lab according to this nice link and I got stuck at this point:
  
  [celladmin@exqrcel01 ~]$ cellcli -e create cell exqrcel01 interconnect1=eth1
  
  CELL-02625: Interface eth1 refers to device name .
  Device name must be same as Interface name.
  [celladmin@exqrcel01 ~]$
  
  [root@exqrcel01 config]# ifconfig -a
  eth0 Link encap:Ethernet HWaddr 08:00:27:F4:14:28
  inet addr:192.168.2.19 Bcast:192.168.2.255 Mask:255.255.255.0 --> My public IP
  ……
  
  eth1 Link encap:Ethernet HWaddr 08:00:27:E0:CA:FF --> should be my InfiniBand
  inet addr:192.168.56.6 Bcast:192.168.56.255 Mask:255.255.255.0
  ……
  
  Any idea how I could move forward?
  
  Thanks and merry Xmas
  Ousseini Oumarou
  
  - dbaesp says:
    
    26/12/2014 at 00:16
    
    First of all … Merry Christmas !!
    Then if you already found the problem and the fix, please post for future help.
    Otherwise you should check ifcfg* files in /etc/sysconfig/network-scripts/
    because the error seems to be related to network configuration
  - Ousseini Oumarou, Senior Oracle DBA says:
    
    26/12/2014 at 12:03
    
    Hello,
    I was able to configure the cell server successfully, and kfod on the db server also returns the expected disks. However, the grid installation has now failed for the 2nd time with ORA-15080.
    
    Although I selected 3 disks (ASM_OCR, normal redundancy), I can see ORA-15063 (insufficient number of disks) in the asmca log file.
    Strange behavior on the storage server (exqrcel01) as well: CELLSRV went down and the CellCLI alert history listed an ORA-00600 error.
    
    This is really a cumbersome task. I do not know whether this could be related to my environment settings:
    OEL 6.6, CellServer software: cell-11.2.3.3.1_LINUX.X64_140708-1.x86_64.
    Virtualbox running on external Hard Disk.
    
    [celladmin@exqrcel01 ~]$ uname -na
    Linux exqrcel01 2.6.32-504.el6.x86_64 #1 SMP Tue Oct 14 01:47:47 PDT 2014 x86_64 x86_64 x86_64 GNU/Linux
    [celladmin@exqrcel01 ~]$
    [celladmin@exqrcel01 ~]$ rpm -qa | grep -i cell
    cell-11.2.3.3.1_LINUX.X64_140708-1.x86_64
    [celladmin@exqrcel01 ~]$
    
    [root@exqrdb01 Desktop]# /software/grid11204_unzip/stage/ext/bin/kfod disks=all op=disks
    WARNING: Using brute force method to determine the size of /dev/raw/rawctl.
    There will be performance issues. Please check configuration to determine the cause for the failure of ioctl
    --------------------------------------------------------------------------------
    Disk Size Path User Group
    ================================================================================
    1: 1024 Mb o/192.168.56.6/DATA_CD_DISK01_exqrcel01
    2: 1024 Mb o/192.168.56.6/DATA_CD_DISK02_exqrcel01
    3: 1024 Mb o/192.168.56.6/DATA_CD_DISK03_exqrcel01
    4: 1024 Mb o/192.168.56.6/DATA_CD_DISK04_exqrcel01
    …….
    
    [root@exqrdb01 Desktop]# /u01/app/11.2.0/grid/root.sh
    Performing root user operation for Oracle 11g
    
    The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME= /u01/app/11.2.0/grid
    
    ………
    Installing Trace File Analyzer
    CRS-2672: Attempting to start 'ora.cssdmonitor' on 'exqrdb01'
    CRS-2676: Start of 'ora.cssdmonitor' on 'exqrdb01' succeeded
    CRS-2672: Attempting to start 'ora.cssd' on 'exqrdb01'
    CRS-2672: Attempting to start 'ora.diskmon' on 'exqrdb01'
    CRS-2676: Start of 'ora.diskmon' on 'exqrdb01' succeeded
    CRS-2676: Start of 'ora.cssd' on 'exqrdb01' succeeded
    
    Disk Group ASM_OCR creation failed with the following message:
    ORA-15018: diskgroup cannot be created
    ORA-15080: synchronous I/O operation to a disk failed
    
    Configuration of ASM … failed
    see asmca logs at /u01/app/oracle/cfgtoollogs/asmca for details
    ……
    
    tail -100 /u01/app/oracle/cfgtoollogs/asmca/asmca-141226AM124315.log
    
    [main] [ 2014-12-26 00:44:10.108 CET ] [USMInstance.configureLocalASM:3041] ORA-15032: not all alterations performed
    ORA-15017: diskgroup "ASM_OCR" cannot be mounted
    ORA-15063: ASM discovered an insufficient number of disks for diskgroup "ASM_OCR"
    
    CellCLI> list cell detail
    name: exqrcel01
    ….
    cellsrvStatus: stopped
    msStatus: running
    rsStatus: running
    
    CellCLI>
    
    -------------------
    CELLCLI --> Alerthistory
    
    CellCLI> list alerthistory
    ….
    35 2014-12-25T23:44:38+01:00 critical “RS-7445 [Serv MS is absent] [It will be restarted] [] [] [] [] [] [] [] [] [] []”
    36 2014-12-26T00:44:18+01:00 critical “ORA-00600: internal error code, arguments: [StorageIdx::getOclSIRegion], [], [], [], [], [], [], [], [], [], [], []”
    
    CellCLI>
  - dbaesp says:
    
    27/12/2014 at 11:24
    
    I had problems with rds on 6.5, and so did some readers of this blog. I suggest using OEL 5 with the el kernel.
  - Ousseini Oumarou, Senior Oracle DBA says:
    
    26/12/2014 at 17:44
    
    Hello,
    
    the error CELL-02625: Interface eth1 refers to device name was caused by the missing entry DEVICE=eth1 in
    /etc/sysconfig/network-scripts/ifcfg-eth1.
    I do not know why it was missing; the probable cause is the clone, because I cloned the cell server from an existing VirtualBox VM. After adding the entry, the configuration ran successfully.
    
    I hope this information helps.
    
    Regards
  - dbaesp says:
    
    27/12/2014 at 11:22
    
    Perfect. Thanks for sharing.
  - TAPAS KUMAR KARMAKAR says:
    
    22/08/2019 at 17:34
    
    Hi, Ousseini Oumarou
    
    I also got this error. It is a network-related error.
    The cause lies in the /etc/sysconfig/network-scripts/ifcfg-xxx network configuration file.
    The Exadata binaries read the data from this file and throw an error if they find something unusual.
    
    This can be resolved by updating the information in the file:
    1. In some cases the “NAME” field in the file may need to be changed because it differs from the device name.
    2. In some cases the “DEVICE” keyword is missing from the file and must be added.
    3. There may be a mismatch in the MAC address of the Ethernet card.
    
    In my case the issue was resolved by adding “DEVICE=eth2” to the file, as it was not present in the OEL 6.10 network configuration file.
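
The DEVICE check described in these replies is easy to run before calling create cell. A minimal sketch, assuming the standard /etc/sysconfig/network-scripts layout mentioned above; the function name is invented, and it covers only the missing DEVICE entry, not the NAME or MAC address mismatches.

```shell
# check_ifcfg_device IFACE [FILE] (hypothetical helper): report whether the
# ifcfg file for IFACE contains a matching DEVICE= line.
check_ifcfg_device() {
    iface=$1
    file=${2:-/etc/sysconfig/network-scripts/ifcfg-$iface}
    if grep -Eq "^DEVICE=\"?$iface\"?[[:space:]]*$" "$file"; then
        echo "ok: $file declares DEVICE=$iface"
    else
        echo "missing: add DEVICE=$iface to $file"
        return 1
    fi
}

# Example: check_ifcfg_device eth1
```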
Kalmi says:

11/09/2014 at 17:29

Dear Raymond

sysctl -w fs.aio-max-nr=50000000

and also put into /etc/sysctl.conf

will solve your problem.
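
The second step of this suggestion (persisting the value in /etc/sysctl.conf) can be sketched as below. The helper set_sysctl_line is invented for illustration; it rewrites whichever file is passed to it, so try it on a copy first. Whether 50000000 is the right value for a given VM is the commenter's suggestion, not something verified here.

```shell
# set_sysctl_line KEY VALUE [FILE] (hypothetical helper):
# replace any existing "KEY = ..." line in FILE, or append one.
set_sysctl_line() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    # Keep every line except an earlier assignment of KEY.
    grep -Ev "^[[:space:]]*$key[[:space:]]*=" "$file" > "$tmp" || true
    echo "$key = $value" >> "$tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# As root, on the real file:
#   set_sysctl_line fs.aio-max-nr 50000000
#   sysctl -p        # reload the settings from /etc/sysctl.conf
```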

Greg Y. says:

26/10/2014 at 19:08

Hi
I am unable to create cell with error connecting to MS. It complains about the port 8888, but the port is listening. Any ideas and suggestions?

[celladmin@stocell1 ~]$ cellcli -e alter cell restart services all

Stopping the RS, CELLSRV, and MS services…
The SHUTDOWN of services was successful.
Starting the RS, CELLSRV, and MS services…
Getting the state of RS services… running
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful.
CELL-01547: CELLSRV startup failed due to unknown reasons.
Starting MS services…
The STARTUP of MS services was successful.

[celladmin@stocell1 ~]$ cellcli -e create cell stocell1 interconnect1=eth1

CELL-01514: Connect Error. Verify that Management Server is listening at the specified HTTP port: 8888.
[celladmin@stocell1 ~]$

Below are my environment and info from the log:

[root@stocell1 modprobe.d]# lsmod |grep rds
rds_rdma 80877 0
rdma_cm 36834 1 rds_rdma
ib_core 74355 6 rds_rdma,rdma_cm,ib_cm,iw_cm,ib_sa,ib_mad
rds_tcp 10293 0
rds 96610 2 rds_rdma,rds_tcp

[root@stocell1 modprobe.d]# netstat -anp|grep 8888
tcp 0 0 127.0.0.1:34027 127.0.0.1:8888 TIME_WAIT –
tcp 0 0 127.0.0.1:34032 127.0.0.1:8888 TIME_WAIT –
tcp 0 0 ::ffff:127.0.0.1:8888 :::* LISTEN 6540/java

[root@stocell1 modprobe.d]# netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
192.168.1.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
127.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1
0.0.0.0 192.168.1.1 0.0.0.0 UG 0 0 0 eth0

Below are from ms-odl.log

[root@stocell1 modprobe.d]#
[2014-10-26T11:26:53.828-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSCoreImpl] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] lunstat: normal changeStat: found lunname: /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/stocell1_DISK01
[2014-10-26T11:26:53.828-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSCoreImpl] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] In lunFound: LUN /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/stocell1_DISK01, os devicename: /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/stocell1_DISK01
[2014-10-26T11:26:54.007-05:00] [ossmgmt] [WARNING] [] [ms.core.MSCoreImpl] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] Tuning Block IO failed on device: /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/disks/raw/stocell1_DISK01
[2014-10-26T11:26:54.008-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] msosscomm ctx not valid, trying to init
[2014-10-26T11:26:54.037-05:00] [ossmgmt] [ERROR] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] Required IP parameters not configured. Err: 36
[2014-10-26T11:26:54.039-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] _ms_mprotect_corrupt_buf is set to TRUE
[2014-10-26T11:26:54.044-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSCoreImpl] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] Error while trying to sync diflist: oracle.ossmgmt.common.core.SageException: CELL-02627: There is a communication error between MS and CELLSRV. Configuration file cellinit.ora is malformed or does not include required information.[[
oracle.ossmgmt.common.core.SageException: CELL-02627: There is a communication error between MS and CELLSRV. Configuration file cellinit.ora is malformed or does not include required information.
at oracle.ossmgmt.ms.core.MSOSSComm.static_sendrecv(Native Method)
at oracle.ossmgmt.ms.core.MSOSSComm.isValidCellDisk(MSOSSComm.java:2989)
at oracle.ossmgmt.ms.core.MSCoreImpl.isValidCellDisk(MSCoreImpl.java:2043)
at oracle.ossmgmt.ms.core.MSCoreImpl.isValidSageLun(MSCoreImpl.java:2070)
at oracle.ossmgmt.ms.core.MSCoreImpl.lunFound(MSCoreImpl.java:2920)
at oracle.ossmgmt.ms.core.MSCoreImpl.getNewDiskAdpState(MSCoreImpl.java:8320)
at oracle.ossmgmt.ms.core.MSDiskPollTimerTask.run(MSDiskPollTimerTask.java:108)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)
]]
[2014-10-26T11:30:13.997-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] msosscomm ctx not valid, trying to init
[2014-10-26T11:30:13.999-05:00] [ossmgmt] [ERROR] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] Required IP parameters not configured. Err: 36
[2014-10-26T11:30:14.000-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] _ms_mprotect_corrupt_buf is set to TRUE
[2014-10-26T11:30:14.008-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] msosscomm ctx not valid, trying to init
[2014-10-26T11:30:14.011-05:00] [ossmgmt] [ERROR] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] Required IP parameters not configured. Err: 36
[2014-10-26T11:30:14.012-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] _ms_mprotect_corrupt_buf is set to TRUE
[2014-10-26T11:30:33.442-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] msosscomm ctx not valid, trying to init
[2014-10-26T11:30:33.449-05:00] [ossmgmt] [ERROR] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] Required IP parameters not configured. Err: 36
[2014-10-26T11:30:33.450-05:00] [ossmgmt] [NOTIFICATION] [] [ms.core.MSOSSComm] [tid: 13] [ecid: 127.0.0.1:24313:1414340313430:3,0] _ms_mprotect_corrupt_buf is set to TRUE

[root@stocell1 trace]# uname -a
Linux stocell1 2.6.32-431.el6.x86_64 #1 SMP Wed Nov 20 23:56:07 PST 2013 x86_64 x86_64 x86_64 GNU/Linux

[root@stocell1 trace]# cat /etc/hosts
127.0.0.1 stocell1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

Below are from Alert log:

RS-7445 [Required IP parameters missing] [Check cellinit.ora] [] [] [] [] [] [] [] [] [] []
Incident details in: /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/log/diag/asm/cell/stocell1/incident/incdir_169/rstrc_6529_4_i169.trc
Sun Oct 26 12:17:47 2014
RSBK version=11.2.3.2.1,label=OSS_11.2.3.2.1_LINUX.X64_130109,Wed_Jan__9_06:09:48_PST_2013
[RS] Started Service RS_BACKUP with pid 6539
[RS] Kill previous monitoring process for core RS
Sun Oct 26 12:17:47 2014
[RS] Started monitoring process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrssmt with pid 6544
Sweep [inc][169]: completed
[RS] Required IP parameters not configured in cellinit.ora. Err: 36
OS Hugepage status:
Total/free hugepages available=12/12; hugepage size=2048KB
[RS] Start service CELLSRV failed with error: -74.
Sun Oct 26 12:17:47 2014
Could not connect to MS socket. Communication with MS may be degraded. errno=115
[RS] monitoring process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrsomt (pid: 0) returned with error: 124
[RS] Started Service MS with pid 6540

[root@stocell1 raw]# ls -l
total 0
lrwxrwxrwx 1 root root 8 Oct 25 22:26 stocell1_DISK01 -> /dev/sdb
lrwxrwxrwx 1 root root 8 Oct 25 22:27 stocell1_DISK02 -> /dev/sdc
lrwxrwxrwx 1 root root 8 Oct 25 22:27 stocell1_DISK03 -> /dev/sdd
lrwxrwxrwx 1 root root 8 Oct 25 22:27 stocell1_DISK04 -> /dev/sde
lrwxrwxrwx 1 root root 8 Oct 25 22:28 stocell1_DISK05 -> /dev/sdf
lrwxrwxrwx 1 root root 8 Oct 25 22:28 stocell1_DISK06 -> /dev/sdg
lrwxrwxrwx 1 root root 8 Oct 25 22:31 stocell1_DISK07 -> /dev/sdh
lrwxrwxrwx 1 root root 8 Oct 25 22:31 stocell1_DISK08 -> /dev/sdi
lrwxrwxrwx 1 root root 8 Oct 25 22:31 stocell1_DISK09 -> /dev/sdj
lrwxrwxrwx 1 root root 8 Oct 25 22:31 stocell1_DISK10 -> /dev/sdk
lrwxrwxrwx 1 root root 8 Oct 25 22:32 stocell1_DISK11 -> /dev/sdl
lrwxrwxrwx 1 root root 8 Oct 25 22:32 stocell1_DISK12 -> /dev/sdm
lrwxrwxrwx 1 root root 8 Oct 25 22:34 stocell1_DISK13 -> /dev/sdn
lrwxrwxrwx 1 root root 8 Oct 25 22:34 stocell1_DISK14 -> /dev/sdo
lrwxrwxrwx 1 root root 8 Oct 25 22:34 stocell1_DISK15 -> /dev/sdp
lrwxrwxrwx 1 root root 8 Oct 25 22:34 stocell1_DISK16 -> /dev/sdq
lrwxrwxrwx 1 root root 8 Oct 25 22:34 stocell1_DISK17 -> /dev/sdr
lrwxrwxrwx 1 root root 8 Oct 25 22:34 stocell1_DISK18 -> /dev/sds
[root@stocell1 raw]#
[root@stocell1 raw]# fdisk -l |grep "B,"
Disk /dev/sda: 26.8 GB, 26843545600 bytes
Disk /dev/sdb: 524 MB, 524288000 bytes
Disk /dev/sdc: 524 MB, 524288000 bytes
Disk /dev/sdd: 524 MB, 524288000 bytes
Disk /dev/sde: 524 MB, 524288000 bytes
Disk /dev/sdf: 524 MB, 524288000 bytes
Disk /dev/sdg: 524 MB, 524288000 bytes
Disk /dev/sdh: 524 MB, 524288000 bytes
Disk /dev/sdi: 524 MB, 524288000 bytes
Disk /dev/sdj: 524 MB, 524288000 bytes
Disk /dev/sdk: 524 MB, 524288000 bytes
Disk /dev/sdl: 524 MB, 524288000 bytes
Disk /dev/sdm: 524 MB, 524288000 bytes
Disk /dev/sdn: 419 MB, 419430400 bytes
Disk /dev/sdo: 419 MB, 419430400 bytes
Disk /dev/sdp: 419 MB, 419430400 bytes
Disk /dev/sdq: 524 MB, 524288000 bytes
Disk /dev/sds: 419 MB, 419430400 bytes
Disk /dev/sdr: 419 MB, 419430400 bytes
Disk /dev/mapper/vg_stocell3-lv_root: 23.6 GB, 23630708736 bytes
Disk /dev/mapper/vg_stocell3-lv_swap: 2684 MB, 2684354560 bytes
~

- dbaesp says:
  
  26/10/2014 at 20:48
  
  Hi,
  Usually this kind of error is related to the host/cellinit/eth configuration. Post them or check them against the info in the blog.
  
  You should also fix the disk names: the storage cell software automatically recognizes flash disks by a name containing the string FLASH… 🙂 it seems a joke but…
  
Greg Y. says:

27/10/2014 at 02:44

Hi,

Thanks for the prompt reply. I will fix the disk names shortly, but I am always confused by the network. I thought that there was some network issue but couldn’t figure it out. I could ping the HOST (192.168.1.5) from stocell1 (192.168.1.52), and ping back. I tried both the localhost IP 127.0.0.1 and the static IP 192.168.1.52 for the VM, but it didn’t work. Your help is greatly appreciated.

Here is the HOST ifcfg-eth0:

DEVICE=eth0
TYPE=Ethernet
UUID=315ac4fe-111f-4542-a937-dab7c0567f68
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
NAME="System eth0"
NETMASK=255.255.255.0
USERCTL=no
HWADDR=00:1F:29:DE:8B:3E
IPADDR=192.168.1.5
PREFIX=24
GATEWAY=192.168.1.1
DNS1=192.168.1.1
LAST_CONNECT=1413740127

HOST ifcfg-eth1:

DEVICE=eth1
TYPE=Ethernet
UUID=e7d3e3c9-ad13-472b-900e-1b91486c45c0
ONBOOT=no
NM_CONTROLLED=yes
BOOTPROTO=dhcp
DEFROUTE=yes
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
NAME="System eth1"
HWADDR=00:1F:29:DE:8B:42
PEERDNS=yes
PEERROUTES=yes

VM stocell1: ifcfg-eth0:

DEVICE=eth0
TYPE=Ethernet
UUID=e7fd04da-9ce8-4143-9f99-29ebc3372c71
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=dhcp
DEFROUTE=yes
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
NAME="System eth0"
HWADDR=08:00:27:38:C4:DC
PEERDNS=yes
PEERROUTES=yes
LAST_CONNECT=1414337451

VM stocell1: ifcfg-eth1

DEVICE=eth1
TYPE=Ethernet
UUID=e7c64609-0f37-4c04-ad9f-984201e6bc49
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=none
IPADDR=192.168.1.52
PREFIX=24
GATEWAY=192.168.1.1
DEFROUTE=yes
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
NAME="System eth1"
HWADDR=08:00:27:56:63:36
LAST_CONNECT=1414349747

On VM stocell1, the network is below:

Adapter1 is attached to “Bridged Adapter” and the name is “eth0”
Adapter2 is attached to “Host-only Adapter” and the name is “vboxnet0”

Greg Y. says:

28/10/2014 at 02:04

Hi,

I fixed the Disk name to FLASH, and fixed some errors, now the config is below, but I am still getting the same error about the HTTP port 8888.

HOST Virtualbox Setting
Host-only Networks vboxnet0 IPv4 =192.168.56.1,
IPv4 Network Mask=255.255.255.0
IPv6 = (there are some numbers, can’t remove them),
IPv6 Network Mask Length=64
eth0 Method: Manual
IPv4=192.168.1.5
Netmask=255.255.255.0
Gateway=192.168.1.1

VM stocell1 Setting
Host-only Networks vboxnet0

eth0 Method: Automatic (DHCP)
eth1 IPv4 = 192.168.56.101
Netmask=255.255.255.0
Gateway=192.168.1.1

I noticed in the Cell install log “.install_log.txt”, the installation inflated oc4jpatch to /tmp and tried to apply, but it says

apply -jdk /usr/java/jdk1.5.0_15/ -oh /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/oc4j/ms -silent /tmp/oc4jpatch/7439847 command failed: No such file or directory

The oc4jpatch actually does not exist in /tmp, wonder if it was deleted unapplied. Not sure if it matters.

Greg Y. says:

28/10/2014 at 02:11

And in cellinit.ora:

version=0.0
HTTP_PORT=8888
bbuChargeThreshold=800
SSL_PORT=23943
RMI_PORT=23791
bbuTempThreshold=60
DEPLOYED=TRUE
JMS_PORT=9127
BMC_SNMP_PORT=162

Greg Y. says:

29/10/2014 at 04:44

OK, I figured out what happened to HTTP port 8888. After the reboot, the firewall came up. Shutting down the firewall, SELinux, and IPv6 fixed that issue. However, I still can’t seem to get past the cell creation part. Below is the error:

cellcli -e create cell interconnect1=eth1

CELL-02598: Ipaddress/Netmask attribute is not properly configured for interconnect eth1

Here is eth1:

Address: 192.168.56.12 (this is the fixed IP that I gave to stocell2)
Netmask: 255.255.255.0
Gateway: 192.168.1.1 (What should this gateway IP be: the router IP 192.168.1.1, or the host IP 192.168.56.1? It doesn’t matter either way, however.)

Any ideas?

- dbaesp says:
  
  29/10/2014 at 06:40
  
  I’m not sure whether it is related to the OS version you chose. By the way, the simulated InfiniBand network should be … 56.xxx in your env. Both machines should have an IP on that network. Correct routing can also be configured in the route-eth1 and rule-eth1 configuration files (in the same path as the eth1 network config file). But I think that is only a performance matter if you can ping/ssh from one machine to the other.
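
The "both machines on the same network" condition can be sanity-checked with Python's standard `ipaddress` module. A minimal sketch, assuming a /24 netmask as in the configurations posted above; the function name `same_network` is made up for this example, and the addresses are the ones quoted in this thread.

```python
import ipaddress


def same_network(addr_a, addr_b, prefix=24):
    """Return True if addr_b lies in the network that addr_a/prefix belongs to."""
    network = ipaddress.ip_network(f"{addr_a}/{prefix}", strict=False)
    return ipaddress.ip_address(addr_b) in network


# eth1 of stocell2 and eth1 of stocell1, as posted in the comments above.
print(same_network("192.168.56.12", "192.168.56.101"))
# eth1 of stocell2 against the gateway value that was entered for it.
print(same_network("192.168.56.12", "192.168.1.1"))
```

The second call illustrates why the 192.168.1.1 gateway looks odd on the host-only interface: it is not an address on the 192.168.56.0/24 network at all.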
  
Greg Y. says:

01/11/2014 at 02:31

You are 100% correct that it was indeed the OS version issue. I lowered it to 5.10 and it worked smoothly. Thanks!

- dbaesp says:
  
  01/11/2014 at 08:52
  
  I’m happy that was useful. Have fun
  
Omar says:

19/04/2015 at 21:55

Hi,

First, I would like to say that this is a wonderful way of building a virtual environment for Oracle Exadata. I followed all the steps but got stuck on the following. It would be greatly appreciated if you could help me.

[celladmin@localhost ~]$ cellcli -e alter cell restart services all

Stopping the RS, CELLSRV, and MS services…
The SHUTDOWN of services was successful.
Starting the RS, CELLSRV, and MS services…
Getting the state of RS services… running
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful. Error: Start Timed out
Starting MS services…
The STARTUP of MS services was successful.

here is the output from alert.log file:

Cache Allocation: BufferSize: 32768. Num buffers: 5000. Start Address: 2AE48B076000
Cache Allocation: BufferSize: 65536. Num buffers: 5000. Start Address: 2AE494CB7000
Cache Allocation: BufferSize: 10485760. Num buffers: 7. Start Address: 2AE4A8538000
CELL communication is configured to use 1 interface(s):
192.168.56.102
[RS] Started Service MS with pid 5717
Sun Apr 19 16:33:47 2015
IPC version: Oracle UDP/IP (generic)
IPC Vendor 1 Protocol 2
Version 4.1
Sun Apr 19 16:34:16 2015
[RS] Process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrsomt (pid: 5714) received exception [signal num: 14] [ADDR:0x0]
Sun Apr 19 16:34:16 2015
Sun Apr 19 16:34:16 2015State dump completed for Cellsrv
Sun Apr 19 16:34:16 2015
State dump signal delivered to Cellsrv by RS.
Sun Apr 19 16:34:21 2015
State dump interrupted for Cellsrv by RS. It did not complete in 5 seconds.
Clean shutdown signal delivered to OSS
[RS] monitoring process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrsomt (pid: 0) returned with error: 124

- dbaesp says:
  
  20/04/2015 at 13:07
  
  Ciao,
  please describe your env: OS version, RAM, cell/db version, IP settings if different from the post…
  
  - Omar says:
    
    21/04/2015 at 04:03
    
    os version: same as you described in your site
    
    Linux localhost.localdomain 2.6.18-371.el5 #1 SMP Mon Sep 30 16:34:30 PDT 2013 x86_64 x86_64 x86_64 GNU/Linux
    
    memory info: this time I bumped it up to 4 GB
    
    MemTotal: 4050948 kB
    MemFree: 1443844 kB
    Buffers: 132368 kB
    Cached: 840552 kB
    SwapCached: 0 kB
    Active: 987032 kB
    Inactive: 535028 kB
    HighTotal: 0 kB
    HighFree: 0 kB
    LowTotal: 4050948 kB
    LowFree: 1443844 kB
    SwapTotal: 4095992 kB
    SwapFree: 4095992 kB
    Dirty: 76 kB
    Writeback: 0 kB
    AnonPages: 549172 kB
    Mapped: 89868 kB
    Slab: 78532 kB
    PageTables: 29480 kB
    NFS_Unstable: 0 kB
    Bounce: 0 kB
    CommitLimit: 5647352 kB
    Committed_AS: 1906168 kB
    VmallocTotal: 34359738367 kB
    VmallocUsed: 47568 kB
    VmallocChunk: 34359690275 kB
    HugePages_Total: 463
    HugePages_Free: 463
    HugePages_Rsvd: 451
    Hugepagesize: 2048 kB
    
    cell/db version: same as you described in your site
    
    ip configuration: I have used a static IP for eth1, 192.168.56.50
    
    [root@localhost ~]# ifconfig
    eth0 Link encap:Ethernet HWaddr 08:00:27:14:B8:D1
    inet addr:192.168.1.104 Bcast:192.168.1.255 Mask:255.255.255.0
    inet6 addr: fe80::a00:27ff:fe14:b8d1/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:22807 errors:0 dropped:0 overruns:0 frame:0
    TX packets:14265 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:25594517 (24.4 MiB) TX bytes:2286503 (2.1 MiB)
    
    eth1 Link encap:Ethernet HWaddr 08:00:27:BA:79:49
    inet addr:192.168.56.50 Bcast:192.168.56.255 Mask:255.255.255.0
    inet6 addr: fe80::a00:27ff:feba:7949/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:1143 errors:0 dropped:0 overruns:0 frame:0
    TX packets:872 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:133956 (130.8 KiB) TX bytes:118559 (115.7 KiB)
    
    lo Link encap:Local Loopback
    inet addr:127.0.0.1 Mask:255.0.0.0
    inet6 addr: ::1/128 Scope:Host
    UP LOOPBACK RUNNING MTU:16436 Metric:1
    RX packets:22806 errors:0 dropped:0 overruns:0 frame:0
    TX packets:22806 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:0
    RX bytes:10152347 (9.6 MiB) TX bytes:10152347 (9.6 MiB)
    
    create cell error:
    
    [celladmin@localhost trace]$ cellcli -e create cell exacell interconnect1=eth1
    Cell exacell successfully created
    Starting CELLSRV services…
    The STARTUP of CELLSRV services was not successful. Error: Start Timed out
    
    restart cell error:
    
    [celladmin@localhost trace]$ cellcli -e alter cell restart services all
    
    Stopping the RS, CELLSRV, and MS services…
    The SHUTDOWN of services was successful.
    Starting the RS, CELLSRV, and MS services…
    Getting the state of RS services… running
    Starting CELLSRV services…
    The STARTUP of CELLSRV services was not successful. Error: Start Timed out
    Starting MS services…
    The STARTUP of MS services was successful.
    
    output from alert.log file:
    
    [celladmin@localhost trace]$ tail -50 alert.log
    [RS] Started Service RS_BACKUP with pid 22515
    [RS] Kill previous monitoring process for core RS
    Mon Apr 20 22:57:21 2015
    [RS] Started monitoring process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrssmt with pid 22524
    Mon Apr 20 22:57:21 2015
    Successfully setting event parameter –
    Mon Apr 20 22:57:21 2015
    Successfully setting event parameter –
    CELLSRV process id=22517
    CELLSRV cell host name=localhost.localdomain
    CELLSRV version=11.2.3.2.1,label=OSS_11.2.3.2.1_LINUX.X64_130109,Wed_Jan__9_06:09:48_PST_2013
    OS Hugepage status:
    Total/free hugepages available=451/451; hugepage size=2048KB
    OS Stats: Physical memory: 3956 MB. Num cores: 1
    CELLSRV configuration parameters:
    version=0.0
    Physical memory on machine: 3956 MB.
    Memory reserved for cellsrv: 2356 MBMemory for other processes: 1600 MB.
    celldisk policy config read from /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/deploy/config/cdpolicy.dat with ver no. 1 and pol no. 0
    Auto Online Feature 1.3
    CellServer MD5 Binary Checksum: 4701d6c7fd467f39a4f5e0dc2a4370d2
    OS Hugepage status:
    Total/free hugepages available=463/463; hugepage size=2048KB
    MS_ALERT HUGEPAGE CLEAR
    Cache Allocation: Num 1MB hugepage buffers: 900 Num 1MB non-hugepage buffers: 0
    Cache Allocation: BufferSize: 512. Num buffers: 5000. Start Address: 2B83AD753000
    Cache Allocation: BufferSize: 2048. Num buffers: 5000. Start Address: 2B83AD9C5000
    Cache Allocation: BufferSize: 4096. Num buffers: 5000. Start Address: 2B83AE38A000
    Cache Allocation: BufferSize: 8192. Num buffers: 10000. Start Address: 2B83AF713000
    Cache Allocation: BufferSize: 16384. Num buffers: 5000. Start Address: 2B83B4534000
    Cache Allocation: BufferSize: 32768. Num buffers: 5000. Start Address: 2B83B9355000
    Cache Allocation: BufferSize: 65536. Num buffers: 5000. Start Address: 2B83C2F96000
    Cache Allocation: BufferSize: 10485760. Num buffers: 7. Start Address: 2B83D6817000
    CELL communication is configured to use 1 interface(s):
    192.168.56.50
    [RS] Started Service MS with pid 22522
    Mon Apr 20 22:57:32 2015
    IPC version: Oracle UDP/IP (generic)
    IPC Vendor 1 Protocol 2
    Version 4.1
    Mon Apr 20 22:58:01 2015
    [RS] Process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrsomt (pid: 22516) received exception [signal num: 14] [ADDR:0x0]
    Mon Apr 20 22:58:01 2015
    Mon Apr 20 22:58:01 2015State dump completed for Cellsrv
    Mon Apr 20 22:58:01 2015
    State dump signal delivered to Cellsrv by RS.
    Mon Apr 20 22:58:06 2015
    State dump interrupted for Cellsrv by RS. It did not complete in 5 seconds.
    Clean shutdown signal delivered to OSS
    [RS] monitoring process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrsomt (pid: 0) returned with error: 124
dbaesp says:

15/05/2015 at 15:40

Sorry, I didn’t see that before.
I no longer have the simulator up and running, so I cannot check the network configuration at the moment.
But it sounds strange to me that
>>CELLSRV cell host name=localhost.localdomain
probably this resolves to 127.0.0.1 or 192.168.1.104.
You should probably configure the host name to match eth1 …
I’m not sure; I will check as soon as I can get the VMs running again.

- Omar says:
  
  17/06/2015 at 19:50
  
  Hi,
  
  Thanks for your suggestion, it worked. I was able to start up cellsrv. But now the problem is that it won’t work after a restart. So if I need to restart the VM, I have to reinstall it, otherwise I get the same error. The firewall is disabled. Can you provide any clue?
  
  thanks
  Omar
  
  - dbaesp says:
    
    19/06/2015 at 09:34
    
    Can you send us your network configuration?
    The hosts file, the eth? config files under /etc/…, and the output of commands such as hostname, ifconfig -a, …
    Do you see errors in the logs? Is your RAM enough?
Mike Kilm says:

28/05/2015 at 02:24

Thanks a lot for putting up a wonderful page.

Quick question about the step you mentioned in your comments on creating the Flash Cache, where you wrote “FLASH uppercase in link name.”
Can you please explain in detail what you meant there, and how to create the flash cache? Your help is appreciated.

Mike

- dbaesp says:
  
  28/05/2015 at 17:31
  
  Hi Mike,
  If I remember well, it was related to the file names used to simulate all the cell disks, flash and non-flash.
  Cellsrv expects to find there a symbolic link to the real device (which we don’t have!). In our case, in place of that symbolic link there is directly a file that is used as the device.
  By experiment we found that if you use upper-case FLASH in the file name, that file is treated as a flash disk.
  With flash disks (or something like that 😉 ) in place, the flash cache can be configured, either automatically by cellsrv or by commands.
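
The naming rule of thumb from this reply can be written down as a tiny Python check. It is a sketch of the observation only: how cellsrv really matches names is not documented here, the case-sensitive substring test is an assumption drawn from the "upper case FLASH" remark, and the file names in the list are hypothetical examples rather than names taken from the post.

```python
def treated_as_flash(file_name):
    """Rule of thumb from the thread: upper-case FLASH in the name means flash disk."""
    return "FLASH" in file_name


# Hypothetical file names, for illustration only.
candidates = ["stocell1_DISK01", "stocell1_FLASH01", "stocell1_flash02"]
for name in candidates:
    kind = "flash disk" if treated_as_flash(name) else "regular disk"
    print(f"{name}: {kind}")
```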
  
Mike Kilm says:

29/05/2015 at 03:34

Thanks, it worked like a charm.
I am finally able to create flashcache / flashlog using the tip that you provided.

dbaesp says:

29/05/2015 at 20:07

Great.
Have fun !

dcuser says:

25/08/2015 at 20:15

Great article.
I am getting the same error that Omar reported, but haven’t seen how Omar/you resolved this issue.
I would appreciate an update on this.

[celladmin@exacell trace]$ cellcli -e create cell exacell interconnect1=eth1
Cell exacell successfully created
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful. Error: Start Timed out

[celladmin@exacell trace]$ tail -f alert*
Tue Aug 25 12:12:50 2015
[RS] Process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrsomt (pid: 9348) received exception [signal num: 14] [ADDR:0x0]
Tue Aug 25 12:12:50 2015
Tue Aug 25 12:12:50 2015State dump completed for Cellsrv
Tue Aug 25 12:12:50 2015
State dump signal delivered to Cellsrv by RS.
Tue Aug 25 12:12:55 2015
State dump interrupted for Cellsrv by RS. It did not complete in 5 seconds.
Clean shutdown signal delivered to OSS
[RS] monitoring process /opt/oracle/cell11.2.3.2.1_LINUX.X64_130109/cellsrv/bin/cellrsomt (pid: 0) returned with error: 124

[celladmin@exacell trace]$ ifconfig
eth0 Link encap:Ethernet HWaddr 08:00:27:FA:FF:E5
inet addr:192.168.56.199 Bcast:192.168.56.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:356 errors:0 dropped:1 overruns:0 frame:0
TX packets:229 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:36370 (35.5 KiB) TX bytes:32071 (31.3 KiB)

eth1 Link encap:Ethernet HWaddr 08:00:27:71:A7:9E
inet addr:192.168.56.50 Bcast:192.168.56.255 Mask:255.255.255.0
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:21 errors:0 dropped:0 overruns:0 frame:0
TX packets:11 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1304 (1.2 KiB) TX bytes:462 (462.0 b)

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:58998 errors:0 dropped:0 overruns:0 frame:0
TX packets:58998 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:10518775 (10.0 MiB) TX bytes:10518775 (10.0 MiB)

[celladmin@exacell trace]$ cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.56.199 exacell.localhost.com exacell
[celladmin@exacell trace]$ hostname
exacell.localhost.com

[celladmin@exacell trace]$ cat /proc/meminfo
MemTotal: 6074532 kB
MemFree: 594384 kB
Buffers: 2672 kB
Cached: 67376 kB
SwapCached: 8220 kB
Active: 1671668 kB
Inactive: 507560 kB
Active(anon): 1644512 kB
Inactive(anon): 468640 kB
Active(file): 27156 kB
Inactive(file): 38920 kB
Unevictable: 20104 kB
Mlocked: 7840 kB
SwapTotal: 6094844 kB
SwapFree: 6060300 kB
Dirty: 336 kB
Writeback: 0 kB
AnonPages: 2121356 kB
Mapped: 37240 kB
Shmem: 1804 kB
Slab: 69792 kB
SReclaimable: 28164 kB
SUnreclaim: 41628 kB
KernelStack: 3240 kB
PageTables: 29172 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 7588940 kB
Committed_AS: 4708508 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 72944 kB
VmallocChunk: 34359654392 kB
HardwareCorrupted: 0 kB
AnonHugePages: 1351680 kB
HugePages_Total: 1507
HugePages_Free: 1507
HugePages_Rsvd: 1501
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 10176 kB
DirectMap2M: 6232064 kB

- dbaesp says:
  
  19/09/2015 at 07:45
  
  You point to eth1, with IP …50, but on your host the name is mapped to …199.

  I cannot check because the simulator was created for study and then destroyed, but the issue should be configuration-related.
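
The mismatch described in this reply can be made explicit with a short Python sketch that looks up a host name in `/etc/hosts`-style text and compares the result with the interconnect address. The helper name `hosts_lookup` is an assumption for this example; the hosts entries and the eth1 address are copied from the comment above. It only inspects the hosts file text, not DNS or other resolver sources.

```python
def hosts_lookup(hosts_text, name):
    """Return the first address that lists `name` in hosts-file text, or None."""
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if name in fields[1:]:
            return fields[0]
    return None


HOSTS = """\
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.56.199 exacell.localhost.com exacell
"""
INTERCONNECT_IP = "192.168.56.50"  # eth1 address from the ifconfig output

resolved = hosts_lookup(HOSTS, "exacell.localhost.com")
print("host name maps to:", resolved)
print("matches interconnect1 (eth1):", resolved == INTERCONNECT_IP)
```

With the posted files the host name maps to 192.168.56.199 (eth0) while the cell was created on eth1, which is the inconsistency pointed out here.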
  
david says:

03/09/2015 at 17:04

Is it required to create griddisks for RECO asm diskgroup? Thanks

- dbaesp says:
  
  19/09/2015 at 07:38
  
  No, it isn’t!
  
Ketan Varia says:

12/09/2015 at 14:22

I am facing the ORA-600 issues below while installing a database on Exadata. Could you please help resolve them?

[celladmin@cell1 cell11.2.3.3.0_LINUX.X64_131014.1]$ cellcli -e list alerthistory
1 2015-09-12T15:21:17+05:30 critical "RS-700 [No IP found in Exadata config file] [Check cellinit.ora] [] [] [] [] [] [] [] [] [] []"
2 2015-09-12T15:32:07+05:30 critical "RS-700 [No IP found in Exadata config file] [Check cellinit.ora] [] [] [] [] [] [] [] [] [] []"
3 2015-09-12T15:43:08+05:30 critical "RS-700 [No IP found in Exadata config file] [Check cellinit.ora] [] [] [] [] [] [] [] [] [] []"
4 2015-09-12T15:51:38+05:30 critical "RS-700 [No IP found in Exadata config file] [Check cellinit.ora] [] [] [] [] [] [] [] [] [] []"
5 2015-09-12T16:02:47+05:30 critical "RS-700 [No IP found in Exadata config file] [Check cellinit.ora] [] [] [] [] [] [] [] [] [] []"
6 2015-09-12T17:06:18+05:30 critical "ORA-00600: internal error code, arguments: [StorageIdx::getOclSIRegion], [], [], [], [], [], [], [], [], [], [], []"

- dbaesp says:
  
  19/09/2015 at 07:40
  
  Uhm,
  We don’t have enough info.
  Memory, network conf, cellinit.ora, …
  
Amit says:

14/12/2015 at 00:54

Hi

I am having a problem creating the cell and am getting the error below in alert.log. Could you please help?

OS: OEL 5.10 (Linux cell1.test 2.6.39-400.264.5.el5uek #1 )
Cell Software: cell11.2.3.3.1

[RS] Started monitoring process /opt/oracle/cell11.2.3.3.1_LINUX.X64_140708/cellsrv/bin/cellrsomt with pid 6971
Mon Dec 14 10:34:54 2015
Successfully setting event parameter –
Mon Dec 14 10:34:54 2015
Successfully setting event parameter –
CELLSRV process id=6972
CELLSRV cell host name=cell1.test
CELLSRV version=11.2.3.3.1,label=OSS_11.2.3.3.1_LINUX.X64_140708,Tue_Jul__8_04:01:56_PDT_2014
CELLSRV version md5: 32ea01399bfb4c21a7a732e1946701c3
OS Stats: Physical memory: 5838 MB. Num cores: 2
OS Hugepage status:
Total/free hugepages available=1513/1501; hugepage size=2048KB
CELLSRV configuration parameters:
version=0.0
Physical memory on machine: 5838 MB.
Memory reserved for cellsrv: 4238 MBMemory for other processes: 1600 MB.
Running on simulated hardware in production environment
celldisk policy config read from /opt/oracle/cell11.2.3.3.1_LINUX.X64_140708/cellsrv/deploy/config/cdpolicy.dat with ver no. 2 and pol no. 0
Auto Online Feature 1.6
OS Hugepage status:
Total/free hugepages available=1525/1513; hugepage size=2048KB
MS_ALERT HUGEPAGE CLEAR
Cache Allocation: Num 1MB hugepage buffers: 3000 Num 1MB non-hugepage buffers: 0
Cache Allocation: BufferSize: 512. Num buffers: 5000. Start Address: 7F317A288000
Cache Allocation: BufferSize: 2048. Num buffers: 5000. Start Address: 7F317A4FA000
Cache Allocation: BufferSize: 4096. Num buffers: 5000. Start Address: 7F317AEBF000
Cache Allocation: BufferSize: 8192. Num buffers: 10000. Start Address: 7F317C248000
Cache Allocation: BufferSize: 16384. Num buffers: 5000. Start Address: 7F3181069000
Cache Allocation: BufferSize: 32768. Num buffers: 5000. Start Address: 7F3185E8A000
Cache Allocation: BufferSize: 65536. Num buffers: 5000. Start Address: 7F318FACB000
Cache Allocation: BufferSize: 10485760. Num buffers: 7. Start Address: 7F31A334C000
CELL communication is configured to use 1 interface(s):
192.168.3.100
CELL IP affinity details:
NUMA status: non-NUMA system
cellaffinity.ora status: N/A
CELL communication will use 1 IP group(s):
Grp 0: *192.168.3.100
Mon Dec 14 10:35:04 2015
IPC version: Oracle UDP/IP (generic)
IPC Vendor 1 Protocol 2
Version 4.1
Mon Dec 14 10:35:34 2015
[RS] Process /opt/oracle/cell11.2.3.3.1_LINUX.X64_140708/cellsrv/bin/cellrsomt (pid: 6971) received exception [signal num: 14] [ADDR:0x0]
Mon Dec 14 10:35:34 2015
Mon Dec 14 10:35:34 2015 70 msec State dump completed for CELLSRV
Mon Dec 14 10:35:34 2015
State dump signal delivered to Cellsrv by RS.
Mon Dec 14 10:35:39 2015
State dump interrupted for Cellsrv by RS. It did not complete in 5 seconds.
Clean shutdown signal delivered to CELLSRV by pid – 4438, tid – 0
[RS] monitoring process /opt/oracle/cell11.2.3.3.1_LINUX.X64_140708/cellsrv/bin/cellrsomt (pid: 6971) returned with error: 124

Regards
Amit

Deep says:

20/08/2016 at 16:09

Hi All,

First of all, thanks for creating such a great document. I was able to create the storage cell successfully up to this step.

I faced the same issue that others have faced:
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful. Error: Start Timed out

The issue was that the ETH1 IP differed from the value in /etc/hosts. So I added another entry with the ETH1 IP resolving to stocell1.localhost.com stocell1 and restarted the network service. I still got the error after that.

Then I just changed the names in /etc/hosts to stocell1.localhost.com stocell11, restarted the network service and the VMware machine, and retried with cellcli -e create cell stocell1 interconnect1=eth1. Boom, it worked! After that, all the commands above work and everything is up.

I don’t understand what the real issue with the name stocell1 is.

Below is my current storage cell config (can you please tell me whether this config is correct or wrong? I am confused here):

CellCLI> list cell detail
name: stocell11
bbuTempThreshold: 60
bbuChargeThreshold: 800
bmcType: absent
cellVersion: OSS_11.2.3.2.1_LINUX.X64_130109
cpuCount: 1
diagHistoryDays: 7
fanCount: 1/1
fanStatus: normal
flashCacheMode: WriteThrough
id: c9938562-c4f9-4bcd-8866-cd07323a3bf4
interconnectCount: 2
interconnect1: eth1
iormBoost: 0.0
ipaddress1: 192.168.19.129/24
kernelVersion: 2.6.32-300.10.1.el5uek
makeModel: Fake hardware
metricHistoryDays: 7
offloadEfficiency: 1.0
powerCount: 1/1
powerStatus: normal
releaseVersion: 11.2.3.2.1
releaseTrackingBug: 14522699
status: online
temperatureReading: 0.0
temperatureStatus: normal
upTime: 0 days, 1:06
cellsrvStatus: running
msStatus: running
rsStatus: running

CellCLI> list celldisk
CD_01_stocell11 normal
CD_02_stocell11 normal
CD_03_stocell11 normal
CD_04_stocell11 normal
CD_05_stocell11 normal
CD_06_stocell11 normal
CD_07_stocell11 normal
CD_09_stocell11 normal
CD_10_stocell11 normal
CD_11_stocell11 normal
CD_12_stocell11 normal
CD_13_stocell11 normal
CD_14_stocell11 normal
CD_15_stocell11 normal
CD_16_stocell11 normal
CD_17_stocell11 normal
CD_18_stocell11 normal
CD_19_stocell11 normal

CellCLI> list griddisk
DATA_CD_01_stocell11 active
DATA_CD_02_stocell11 active
DATA_CD_03_stocell11 active
DATA_CD_04_stocell11 active
DATA_CD_05_stocell11 active
DATA_CD_06_stocell11 active
DATA_CD_07_stocell11 active
DATA_CD_09_stocell11 active
DATA_CD_10_stocell11 active
DATA_CD_11_stocell11 active
DATA_CD_12_stocell11 active
DATA_CD_13_stocell11 active

Thanks
Deep

- Deep says:
  
  20/08/2016 at 16:11
  
  Typo above: the command issued was cellcli -e create cell stocell11 interconnect1=eth1
  
krish says:

28/08/2016 at 11:44

Hello ,
I see the error “ORA-00600: internal error code, arguments: [LinuxBlockIO::init]” in the blog. I was able to solve it by adding the entry below to cellinit.ora:

_cellrsdef_heartbeat_timeout=10

Thanks,
Krish

Rocky says:

11/01/2017 at 20:18

Hi,
Thanks for the excellent post. I tried to set this up in my lab and am running into the following errors:
[celladmin@stocell1 ~]$ cellcli -e alter cell restart services all

Stopping the RS, CELLSRV, and MS services…
The SHUTDOWN of services was successful.
Starting the RS, CELLSRV, and MS services…
Getting the state of RS services… running
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful.
CELL-01531: Unable to parse the cellinit.ora file due to incorrect parameters in the file.
Starting MS services…
The STARTUP of MS services was not successful.
CELL-01531: Unable to parse the cellinit.ora file due to incorrect parameters in the file.
[celladmin@stocell1 ~]$

Note : My setup environment is as below
a) VMware Pro 12
b) Oracle Linux 7
c) Installed cell-12.1.2.3.2_LINUX.X64_160721-1.x86_64.rpm
d) Installed jdk1.8.0_66-1.8.0_66-fcs.x86_64.rpm

cellinit.ora is 0 bytes.
[celladmin@stocell1 config]$ cat /opt/oracle/cell12.1.2.3.2_LINUX.X64_160721/cellsrv/deploy/config/cellinit.ora
[celladmin@stocell1 config]$ ls -l /opt/oracle/cell12.1.2.3.2_LINUX.X64_160721/cellsrv/deploy/config/cellinit.ora
-rw-r--r--. 1 celladmin root 0 Jan 11 12:06 /opt/oracle/cell12.1.2.3.2_LINUX.X64_160721/cellsrv/deploy/config/cellinit.ora

[celladmin@stocell1 config]$ cellcli -e create cell stocell1 interconnect1=eno33554984

CELL-01514: Connect Error. Verify that Management Server is listening at the specified HTTP port: 8888.

[root@stocell1 ]# celld status
rsStatus: running
msStatus: stopped
cellsrvStatus: stopped

Please advise.

- Rocky says:
  
  12/01/2017 at 00:11
  
  I was able to get one step further by adding PATH to lib folder. Now I am stuck at the next command:
  [celladmin@stocell1 ~]$ cellcli -e alter cell restart services all
  
  Stopping the RS, CELLSRV, and MS services…
  The SHUTDOWN of services was successful.
  Starting the RS, CELLSRV, and MS services…
  Getting the state of RS services… running
  Starting CELLSRV services…
  The STARTUP of CELLSRV services was not successful.
  CELL-01531: Unable to parse the cellinit.ora file due to incorrect parameters in the file.
  Starting MS services…
  The STARTUP of MS services was successful.
  
  [celladmin@stocell1 ~]$ cellcli -e create cell stocell1 interconnect1=eth1
  
  CELL-02598: Ipaddress/Netmask attribute is not properly configured for interconnect eth1.
  
  ———————————————–
  [celladmin@stocell1 ~]$ cat /etc/sysconfig/network-scripts/ifcfg-eth1
  TYPE=Ethernet
  BOOTPROTO=none
  DEFROUTE=yes
  IPV4_FAILURE_FATAL=no
  IPV6INIT=no
  NAME=eth1
  UUID=808d6f42-ab25-4017-b9c4-0a52dc9a42db
  DEVICE=eth1
  ONBOOT=yes
  DNS1=192.168.116.2
  IPADDR=192.168.116.161
  GATEWAY=192.168.116.2
  —————————————————
  [celladmin@stocell1 ~]$ netstat -rn
  Kernel IP routing table
  Destination Gateway Genmask Flags MSS Window irtt Iface
  0.0.0.0 192.168.116.2 0.0.0.0 UG 0 0 0 eth0
  0.0.0.0 192.168.116.2 0.0.0.0 UG 0 0 0 eth0
  0.0.0.0 192.168.116.2 0.0.0.0 UG 0 0 0 eth1
  192.168.116.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
  192.168.116.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1
  192.168.122.0 0.0.0.0 255.255.255.0 U 0 0 0 virbr0
  —————————————————
  Please advise.
  
  - dbaesp says:
    
    12/01/2017 at 10:12
    
    Hi, I deleted the simulator just after the exam, so I cannot check against “my” configuration.
    Try explicitly setting NETMASK=255.255.255.0 for eth1, and also post your /etc/hosts.
    Ciao
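
The ifcfg-eth1 file posted above sets neither NETMASK nor PREFIX, which fits the suggestion to set the netmask explicitly. A small Python sketch can flag that condition before running create cell. The helper names are assumptions for this example, and the parser only understands plain KEY=VALUE lines; it does not know which keys cellcli actually reads.

```python
import ipaddress


def parse_ifcfg(text):
    """Parse KEY=VALUE lines of an ifcfg file into a dict (double quotes stripped)."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        cfg[key.strip()] = value.strip().strip('"')
    return cfg


def effective_netmask(cfg):
    """Return the dotted netmask from NETMASK or PREFIX, or None if neither is set."""
    if cfg.get("NETMASK"):
        return cfg["NETMASK"]
    if cfg.get("PREFIX"):
        return str(ipaddress.ip_network("0.0.0.0/" + cfg["PREFIX"]).netmask)
    return None


# ifcfg-eth1 as posted in the comment above (no NETMASK, no PREFIX).
POSTED_ETH1 = """\
TYPE=Ethernet
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=no
NAME=eth1
UUID=808d6f42-ab25-4017-b9c4-0a52dc9a42db
DEVICE=eth1
ONBOOT=yes
DNS1=192.168.116.2
IPADDR=192.168.116.161
GATEWAY=192.168.116.2
"""

print("as posted:", effective_netmask(parse_ifcfg(POSTED_ETH1)))
print("with fix: ", effective_netmask(parse_ifcfg(POSTED_ETH1 + "NETMASK=255.255.255.0\n")))
```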
Rocky says:

12/01/2017 at 15:59

Thanks for the suggestion.
By setting the netmask explicitly, I was able to move one more step forward, but CELLSRV still could not start.

CellCLI> alter cell restart services all

Stopping the RS, CELLSRV, and MS services…
The SHUTDOWN of services was successful.
Starting the RS, CELLSRV, and MS services…
Getting the state of RS services… running
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful.
CELL-01547: CELLSRV startup failed due to unknown reasons.
Starting MS services…
The STARTUP of MS services was successful.

CellCLI> exit
quitting

[celladmin@stocell1 ~]$ cellcli -e create cell stocell1 interconnect1=eth1
Cell stocell1 successfully created
Starting CELLSRV services…
The STARTUP of CELLSRV services was not successful.
CELL-01547: CELLSRV startup failed due to unknown reasons.

=======================================================================================
I see that cellinit.ora has now also been populated with the entry below:
cat /opt/oracle/cell/cellsrv/deploy/config/cellinit.ora
#CELL Initialization Parameters
ipaddress1=192.168.116.161/24

=======================================================================================
[root@stocell1 trace]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.116.151 stocell1
192.168.116.161 stocell1-ib
=======================================================================================
[root@stocell1 trace]# netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 192.168.116.2 0.0.0.0 UG 0 0 0 eth1
0.0.0.0 192.168.116.2 0.0.0.0 UG 0 0 0 eth1
0.0.0.0 192.168.116.2 0.0.0.0 UG 0 0 0 eth0
192.168.116.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1
192.168.116.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
192.168.122.0 0.0.0.0 255.255.255.0 U 0 0 0 virbr0
=======================================================================================
[root@stocell1 trace]# ifconfig -a
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.116.151 netmask 255.255.255.0 broadcast 192.168.116.255
inet6 fe80::20c:29ff:fe07:2c3c prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:07:2c:3c txqueuelen 1000 (Ethernet)
RX packets 67346 bytes 4076442 (3.8 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 16 bytes 1128 (1.1 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.116.161 netmask 255.255.255.0 broadcast 192.168.116.255
inet6 fe80::20c:29ff:fe07:2c46 prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:07:2c:46 txqueuelen 1000 (Ethernet)
RX packets 68830 bytes 4224925 (4.0 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1802 bytes 1386484 (1.3 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
……
…..
…..
=======================================================================================
[root@stocell1 trace]# hostname
stocell1
[root@stocell1 trace]# ping 192.168.116.151
PING 192.168.116.151 (192.168.116.151) 56(84) bytes of data.
64 bytes from 192.168.116.151: icmp_seq=1 ttl=64 time=0.079 ms
64 bytes from 192.168.116.151: icmp_seq=2 ttl=64 time=0.030 ms
^C
--- 192.168.116.151 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.030/0.054/0.079/0.025 ms
[root@stocell1 trace]# ping 192.168.116.161
PING 192.168.116.161 (192.168.116.161) 56(84) bytes of data.
64 bytes from 192.168.116.161: icmp_seq=1 ttl=64 time=0.029 ms
64 bytes from 192.168.116.161: icmp_seq=2 ttl=64 time=0.053 ms
64 bytes from 192.168.116.161: icmp_seq=3 ttl=64 time=0.055 ms
^C
--- 192.168.116.161 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.029/0.045/0.055/0.014 ms
=======================================================================================

Please advise.

- Rocky says:
  
  12/01/2017 at 20:17
  
  Latest Errors :
  [celladmin@stocell1 ~]$ cellcli -e create cell stocell1 interconnect1=eth1
  Cell stocell1 successfully created
  Starting CELLSRV services…
  The STARTUP of CELLSRV services was not successful.
  CELL-01547: CELLSRV startup failed due to unknown reasons.
  
  Alert Log
  ————
  CELL process id=12832
  CELL host name=stocell1
  CELL version=12.1.2.3.2,label=OSS_12.1.2.3.2_LINUX.X64_160721,Thu_Jul_21_09:50:44_PDT_2016
  CELLSRV version md5: 150f7acd0b05d50095223fb9399e36ee
  OS Stats: Physical memory: 5792 MB. Num cores: 1
  CELLSRV configuration parameters:
  Memory reserved for cellsrv: 2892 MB Memory for other processes: 2900 MB
  Running on simulated hardware in production environment
  Successfully allocated 256 MB for Storage Index. Storage Index memory usage can grow up to a maximum of 289 MB.
  CELL communication is configured to use 1 interface(s):
  192.168.116.161
  IPC version: Oracle UDP/IP (generic)
  IPC Vendor 1 Protocol 2
  Version 4.1
  MS_ALERT HUGEPAGE CLEAR
  ossmmap_map: mmap failed for SparseV2PhysMap len: 12800 as there is insufficient memory
  Dumping oal memory statistics (all values in bytes)
  cellsrv: total os mem: 6012474792 sga osmem: 1375731712 pga osmem: 1086888
  cellsrv: sga alloc mem: 1145246520 pga alloc mem: 510120
  group: total os mem: 0 ocl: 3145728
  Memtype: sga: cellsrv os mem 1375731712 all group os mem 0
  Memtype: pga: cellsrv os mem 1086888 all group os mem 0
  Memtype: cache: cellsrv os mem 3962249216 all group os mem 0
  Memtype: storidx: cellsrv os mem 289431552 all group os mem 0
  Memtype: heapsummary: cellsrv os mem 18022400 all group os mem 0
  Memtype: codetext: cellsrv os mem 78643200 all group os mem 0
  Memtype: malloc: cellsrv os mem 33554432 all group os mem 0
  Memtype: stack: cellsrv os mem 253755392 all group os mem 0
  Thu Jan 12 13:12:54 2017
  [RS] monitoring process /opt/oracle/cell12.1.2.3.2_LINUX.X64_160721/cellsrv/bin/cellrsomt (pid: 12830) returned with error: 161
  Errors in file /opt/oracle/cell12.1.2.3.2_LINUX.X64_160721/log/diag/asm/cell/stocell1/trace/svtrc_12832_main.trc (incident=321):
  ORA-00600: internal error code, arguments: [TODO(zutao): handle OOM gracefully], [], [], [], [], [], [], [], [], [], [], []
  Incident details in: /opt/oracle/cell12.1.2.3.2_LINUX.X64_160721/log/diag/asm/cell/stocell1/incident/incdir_321/svtrc_12832_main_i321.trc
  Sweep [inc][321]: completed
  CELLSRV error - ORA-600 internal error
  Thu Jan 12 13:12:55 2017
  CELLSRV is no longer alive before state dump completes.
  Thu Jan 12 13:12:55 2017
  [RS] Stopped Service CELLSRV
  
  This looks like a memory-related error. I am not sure where to adjust it.
  
  - Rocky says:
    
    12/01/2017 at 22:37
    
    Thanks, I was able to proceed by increasing the RAM of my machine. Thanks for all your help.
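
The alert log above reports 5792 MB of physical memory, of which 2892 MB is reserved for cellsrv, and then fails with an insufficient-memory mmap error. A minimal sketch for checking how much memory the OS actually sees before and after adding RAM follows. `mem_total_mb` is a hypothetical helper name, and the sketch assumes a Linux `/proc/meminfo`; it does not say how much memory cellsrv needs, which the thread does not state.

```shell
#!/bin/sh
# mem_total_mb (hypothetical helper): reads /proc/meminfo-style text on stdin
# and prints the MemTotal value converted from kB to whole MB.
mem_total_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }'
}

# Report the memory visible to the OS, if /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    echo "Physical memory: $(mem_total_mb < /proc/meminfo) MB"
fi
```

The printed value can be compared with the "OS Stats: Physical memory" line that cellsrv writes to its alert log at startup.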
Mohammad alam says:

10/10/2017 at 12:44

Hi, thanks a lot for sharing the info. I was able to create the cell storage successfully. I would like to know whether I also need to install the Exadata software on the DB nodes. I configured 2 DB nodes and 3 cell nodes; while running root.sh on node1 it failed and was not able to
Adding Clusterware entries to inittab
CRS-2672: Attempting to start 'ora.mdnsd' on 'qr01db01'
CRS-2676: Start of 'ora.mdnsd' on 'qr01db01' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'qr01db01'
CRS-2676: Start of 'ora.gpnpd' on 'qr01db01' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'qr01db01'
CRS-2672: Attempting to start 'ora.gipcd' on 'qr01db01'
CRS-2676: Start of 'ora.cssdmonitor' on 'qr01db01' succeeded
CRS-2676: Start of 'ora.gipcd' on 'qr01db01' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'qr01db01'
CRS-2672: Attempting to start 'ora.diskmon' on 'qr01db01'
CRS-2676: Start of 'ora.diskmon' on 'qr01db01' succeeded
CRS-2674: Start of 'ora.cssd' on 'qr01db01' failed
CRS-2679: Attempting to clean 'ora.cssd' on 'qr01db01'
CRS-2681: Clean of 'ora.cssd' on 'qr01db01' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'qr01db01'
CRS-2677: Stop of 'ora.gipcd' on 'qr01db01' succeeded
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'qr01db01'
CRS-2677: Stop of 'ora.cssdmonitor' on 'qr01db01' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'qr01db01'
CRS-2677: Stop of 'ora.gpnpd' on 'qr01db01' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'qr01db01'
CRS-2677: Stop of 'ora.mdnsd' on 'qr01db01' succeeded
CRS-4000: Command Start failed, or completed with errors.
CSS startup failed with return code 1
The exlusive mode cluster start failed, see Grid Infrastructure alert log for more information
Initial cluster configuration failed. See /oraeng/GI/cfgtoollogs/crsconfig/rootcrs_qr01db01.log for details
/oraeng/GI/perl/bin/perl -I/oraeng/GI/perl/lib -I/oraeng/GI/crs/install /oraeng/GI/crs/install/rootcrs.pl execution failed
[root@qr01db01 ~]# ifconfig

The ocssd log file shows this error:

71: [ CSSD][1080420672]clssgmDeadProc: proc 0x2a81e50
2017-10-07 09:53:52.185: [ CSSD][1080420672]clssgmDestroyProc: cleaning up proc(0x2a81e50) con(0x1b0) skgpid ospid 10755 with 0 clients, refcount 0
2017-10-07 09:53:52.186: [ CSSD][1080420672]clssgmDiscEndpcl: gipcDestroy 0x1b0
2017-10-07 09:53:52.226: [ CSSD][1080420672]clssscSelect: cookie accept request 0x26a2c10
2017-10-07 09:53:52.226: [ CSSD][1080420672]clssgmAllocProc: (0x2a81ef0) allocated
2017-10-07 09:53:52.226: [ CSSD][1080420672]clssgmClientConnectMsg: properties of cmProc 0x2a81ef0 - 1,2,3,4,5
2017-10-07 09:53:52.226: [ CSSD][1080420672]clssgmClientConnectMsg: Connect from con(0x200) proc(0x2a81ef0) pid(10755) version 11:2:1:4, properties: 1,2,3,4,5
2017-10-07 09:53:52.226: [ CSSD][1080420672]clssgmClientConnectMsg: msg flags 0x0000
2017-10-07 09:53:54.257: [ SKGFD][1099741504]ERROR: -8(OS Error 1 (bind_fail,skgxpvifconf,requested interface 10.0.0.50 failed bind. Check output from ifconfig command,Error 0)
)
2017-10-07 09:53:54.257: [ SKGFD][1099741504]ERROR: -10(OSS Operation oss_initialize failed with error 4 [Network initialization failed]
)
2017-10-07 09:53:54.258: [ CSSD][1099741504]clsssnmvDDiscThread: Unable to create clsf context
2017-10-07 09:53:54.258: [ CSSD][1099741504]###################################
2017-10-07 09:53:54.258: [ CSSD][1099741504]clssscExit: CSSD aborting from thread clssnmvDDiscThread
2017-10-07 09:53:54.258: [ CSSD][1099741504]###################################
2017-10-07 09:53:54.258: [ CSSD][1099741504](:CSSSC00012:)clssscExit: A fatal error occurred and the CSS daemon is terminating abnormally
2017-10-07 09:53:54.258: [ CSSD][1099741504]

However, I checked that the eth interface is up and running, and I am able to ping from both nodes.

Please help me with this.
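
The SKGFD line in the log says the bind to 10.0.0.50 failed and suggests checking the ifconfig output. One thing worth verifying is whether that exact address is assigned to a local interface on qr01db01, because an interface that is up and answers pings does not necessarily carry this address. A minimal sketch follows, assuming the iproute2 `ip` command is available; `addr_in_listing` is a hypothetical helper name, not part of any Oracle tool.

```shell
#!/bin/sh
# addr_in_listing ADDR (hypothetical helper): reads `ip -o -4 addr show`
# output on stdin and succeeds only if ADDR is one of the listed addresses.
# In that output the fourth field is the address with its prefix length,
# for example "10.0.0.50/24".
addr_in_listing() {
    awk -v want="$1" '
        { split($4, parts, "/"); if (parts[1] == want) found = 1 }
        END { exit(found ? 0 : 1) }'
}

# Example run on the database node; 10.0.0.50 is the address from the log.
if command -v ip >/dev/null 2>&1; then
    if ip -o -4 addr show | addr_in_listing 10.0.0.50; then
        echo "10.0.0.50 is configured on a local interface"
    else
        echo "10.0.0.50 is not configured on any local interface"
    fi
fi
```

If the address is not configured locally, the network configuration that requests it (on database nodes this is commonly cellinit.ora, which is an assumption here since the comment does not show it) would be the next place to look.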

hari says:

28/07/2018 at 17:13

Thanks a lot for this post….
