Relinking Oracle Grid Home on RAC

 

Couple of notes from Oracle Metalink when relinking Oracle and Grid Home.


Please note when relinking the grid home the /u01/app/oraInventory/locks directory had to have the permissions changed when the grid relink ran otherwise the relink did not work.

Also, after bringing back some of our boxes and running "crsctl start crs"  the cluster was not starting on that node.

After running "oracleasm listdisks" - it turned out some of the diskgroups were missing.

Under investigation.

We ran "oracleasm scandisks" which discovered the disks but still no success with "crsctl stat res -t"

The only option was to run "crstcl stop crs -f" and force it down. 

Once restarted the server was ok.

Please check that oracleasm scandisks returns the expected values prior to running crsctl start crs.

The text in red are the commands to run once you have stopped all the services on the node you are about to relink.


If you need to "bounce" the server say for a kernel update please run "crsctl disable crs" or the clustering will start once the server restarts and you will have to stop everything again. Please remember to "crsctl enable crs" when you are happy.

Abridged notes from the Metalink Note 1536057.1) :

How To Relink The Oracle Grid Infrastructure RAC/Cluster Installation - v12.2

In order to relink the Oracle Grid Infrastructure RAC/Cluster Installation, please follow the next steps after stopping all the services running:


1) As root OS user, please unlock the Grid Infrastructure Oracle Home (on every node) as follows (set the correct grid home env first) :

# cd < Grid Infrastructure Oracle Home >/crs/install
# rootcrs.sh -unlock

2) As the Oracle Grid Infrastructure owner (on every node) relink the Grid Infrastructure Oracle Home as follows:

$ export ORACLE_HOME=< Grid Infrastructure Oracle Home >
$ < Grid Infrastructure Oracle Home>/bin/relink all

3) As root OS user again, please execute the next commands (on every node):

# cd < Grid Infrastructure Oracle Home >/rdbms/install/
# ./rootadd_rdbms.sh
# cd < Grid Infrastructure Oracle Home >/crs/install
# rootcrs.sh -lock


Then "crsctl start crs" - please see note above re oracleasm listdisks

 

Note: CRS services (CRS, CSS ASM instances, diskgroups, listeners, DB instances, etc.) will automatically start.

 
4) Please review the relink.log file located at the following directory (to confirm or discard any error):

$ < Grid Infrastructure Oracle Home>/install/relink.log


See also :

Relinking Oracle Home FAQ ( Frequently Asked Questions) (Doc ID 1467060.1)


Last date from a group in oracle table


Asked to run a query against a two column table, one column having a procedure name and the other with the date the procedure was last ran.

e.g.

PROC_NAME                      START_DTM
------------------------------ ---------
TVM Cash Balance               05-NOV-20
TVM Cash Balance               12-NOV-20
TVM Cash Balance               02-DEC-20
TVM Cash Balance               01-DEC-20
TVM Cash Balance               03-DEC-20
TVM Cash Balance               04-DEC-20
TVM Cash Balance               30-SEP-20
TVM Cash Balance               18-SEP-20
TVM Cash Balance               01-SEP-20
TVM Cash Balance               29-SEP-20
TVM Cash Balance               12-SEP-20

We needed to select the last date for each procedure.

This works :

select proc_name, max(start_dtm) from dw_main.proc_run_log where proc_name in (select distinct(proc_name) from dw_main.proc_run_log) group by proc_name order by proc_name;


PROC_NAME                                         MAX(START
------------------------------                            ---------
Summarize Tap RW Error Summary     04-DEC-20
Summarize Tap Speed Summary          04-DEC-20
Summarize Txn Timing                          04-DEC-20
TVM Cash Balance                                04-DEC-20
Update Card Statuses                           04-DEC-20
....
131 rows selected.


We wanted to double check (and to show how this could also be achieved in pl/sql) to be absolutely certain so put this together (the sql is also easier to understand ):

create or replace procedure ruslist
    is
    proc_rus varchar2(30);
    start_rus date;
CURSOR d1 IS SELECT distinct proc_name from dw_main.proc_run_log order by proc_name;
CURSOR d2 IS SELECT  max(start_dtm) from  dw_main.proc_run_log where proc_name = proc_rus;
BEGIN
    open d1;
    LOOP
    FETCH d1 into proc_rus;
  EXIT when d1% NOTFOUND;
    dbms_output.put( proc_rus);
   open d2;
   LOOP
    FETCH d2 into start_rus;
   EXIT when d2% NOTFOUND;
    dbms_output.put_line('      '||start_rus);
  END LOOP;
CLOSE d2;
  END LOOP;
CLOSE d1;
EXCEPTION
   WHEN OTHERS THEN
      raise_application_error(-20001,'An error was encountered - '||SQLCODE||' -ERROR- '||SQLERRM);
END ruslist;

......
Summarize Tap RW Error Summary       04-DEC-20
Summarize Tap Speed Summary            04-DEC-20
Summarize Txn Timing                            04-DEC-20
TVM Cash Balance                                  04-DEC-20
Update Card Statuses                              04-DEC-20

PL/SQL procedure successfully completed.

SQL>

Happy days.



A bit of fun.....


Although the games under MobaXTerm can be entertaining wanted something with a bit fun at a Linux level. Some interesting commands are "sl" - Steam Locomotive, to annoy admin people who can't type ls, "cowsay" (what can I say) and "figlet".  

More here :





$yum install sl
$yum install cowsay
$yum install figlet



Let me know if you have used any others or have your own silly commands.










SQL Server Access

Thanks to Graham Brown our resident SQL server guru for this - thought it is worth sharing.

This has happened a few times where SQL instances have been built or cloned by ‘other’ people – who then forget about who is meant to be supporting the instances … or just  go on holiday  ðŸ˜Š

 No Dba access

 If we cannot get access with our windows logins you can force your way in via Single User mode & SQLCMD

 See this :

https://www.sqlshack.com/recover-lost-sa-password/

Once you have gained access yourself please remember to add all the other DBA’s in the team… you know it makes sense!

If you have done the step above and still cannot log on – check the error log

(C:\Program Files\Microsoft SQL Server\MSSQL13.DYNAMICS_CRM_SQL\MSSQL\Log\errorlog for example)  – in the case of the instance I could see this just after the startup :

 2020-09-29 08:14:27.92 Server      Authentication mode is WINDOWS-ONLY.

 And then after I had tried to log on as the NewSA user created previously – and failed – in the log I could see this :

 2020-10-05 03:47:20.42 Logon       Login failed for user 'NewSA'. Reason: An attempt to login using SQL authentication failed. Server is configured for Windows authentication only. [CLIENT: <local machine>]

Basically whoever has created the instance has left it at ‘Windows only authentication’  - now although I can understand this depending on the application the instance is being used for it it’s a pain when we have the lack of login issue … and without the ability to use SSMS to change it to ‘SQL Server and Windows Authentication’ we have to hack the registry :

 https://www.top-password.com/knowledge/sql-server-authentication-mode.html

 One last thing - lately I have found is that people are not setting the SQL Server Network Configuration protocols correctly – for anybody to connect via a client / network connection the TCP/IP protocol must be enabled (as well as Allow Remote Connection to this server obviously) – this is found in SQL Server Configuration Manager and when done correctly looks like this :

 


Can't open perl script - RemoteHostExecutor.pl

Despite banging on about making sure you have the latest OPatch applied on your database / grid homes today I was caught out.

If you have different OPatch versions between the grid and database homes you hit a mismatch error but if you have different versions on different RAC nodes you will hit this :

[root@ora01 gridhome]# $ORACLE_HOME/OPatch/opatchauto apply /u99/media/jul2020/31305382 -oh /u01/app/12.2.0/gridhome -analyze

Can't open perl script "/u01/app/12.2.0/gridhome/OPatch/auto/database/bin/RemoteHostExecutor.pl": (null)

oracle.dbsysmodel.driver.sdk.productdriver.ProductDriverException: Unable to execute command : Can't open perl script "/u01/app/12.2.0/gridhome/OPatch/auto/database/bin/RemoteHostExecutor.pl": (null)

OPatchAuto failed.

If you do a quick search you will be told to apply the latest OPatch but I was already using the latest version but once same version on all nodes - happy days!

So latest version all homes and all nodes.







Rman duplicate database - quick notes.

Quick notes on duplicating database following level 0 rman backup.

Wanted a duplicate of a db called MPROD to one called MPRODX.

So ran the rman backup and copied the backup to another server.

Started the mprodx database with a "skeleton" pfile - the db name option should suffice.

Most of the parameters are optional.

 
SQL> startup nomount pfile=/u01/app/oracle/product/12.2.0/dbhome_1/dbs/initmprodx.ora
ORACLE instance started.
 
Total System Global Area 1.6106E+10 bytes
Fixed Size                  4516272 bytes
Variable Size            2281702992 bytes
Database Buffers         1.3791E+10 bytes
Redo Buffers               29036544 bytes

SQL> exit
Disconnected from Oracle Database 12c Enterprise Edition Release 12.2.0.2.0 - 64bit Production
 
Created a script to do the duplicate (the rman backup was encrypted but not a common thing to do, so ignore the decryption option). 

Created the script dupl,txt
 
set echo on;
-- set decryption identified by 'password123' ;
run {
 
allocate auxiliary channel d1  type disk ;
allocate auxiliary channel d2  type disk ;
allocate auxiliary channel d3  type disk ;
allocate auxiliary channel d4  type disk ;
 
duplicate database to MPRODX  backup location '/backup/rman_back';
 
release channel d1 ;
release channel d2 ;
release channel d3 ;
release channel d4 ;
 
}
  
Run the script in the background (change /home/oracle to wherever).

nohup rman auxiliary / cmdfile=/home/oracle/dupl.txt log=/home/oracle/refresh_mprodx.log &

At the end of this should have a duplicate database. You might hit an issue if with files already existing but if you are totally certain that you are not overwriting an existing database you can use the nofilenamecheck option to stop this. Check the log file to confirm ok.

 

Creating multiple databases with dbca

 Asked to create a number of oracle dbs all the same version, all the same.

Manually ran the dbca and saved the response file.

Edited the response file, changed the sys and system password entries and globally replaced the db name so could run the dbca from the command line and let it just do its thing. 

Remember g/^#/d and g/^$/d will replace the commented / blank lines in vi if you want to make the file easier to read.

$ORACLE_HOME/bin/dbca -silent -createDatabase -responseFile /home/oracle/dbca_oradb2.rsp

Did not want to do this for every db so put this together, bit quick and dirty but seems to do the trick.

In the response file changed the database name to dummy and let sed replace it with the correct name; tried to do this with unix variables but dbca would not pick the correct value and was up against the clock so did it this way for quickness.

#!/bin/sh

oracle_env()

{
ORACLE_HOME=/u01/app/oracle/product/12.2.0/dbhome_1; export ORACLE_HOME
PATH=$PATH:$ORACLE_HOME/bin; export PATH
}

oracle_env

array=(oradb1 oradb2 oradb3 oradb4 oradb5 oradb6)

for i in "${array[@]}" ; do
ORACLE_SID=${i}; export ORACLE_SID
echo $ORACLE_SID

sed "s/dummy/$i/g" dbca.rsp > dbca_$i.rsp

$ORACLE_HOME/bin/dbca -silent -createDatabase -responseFile /home/oracle/dbca_$i.rsp

done

When finished you can delete the dbs in the same way.

dbca -silent -deleteDatabase -sourceDB oradb2 -sysDBAUserName sys -sysDBAPassword Password#2020

As always please feel free to improve and let me know.


Downgrade 19c to 12c


Not something I expected to do but was asked to download some 19c dbs after the dev lot hit an issue with 19 and wanted to go back to 12!

Some quick notes  as I don't expect ever to do this again - maybe 😉

[oracle@ora02 dbs]$ . oraenv

Pick the correct 19c databases

The Oracle base remains unchanged with value /u01/app/oracle

[oracle@ora02 dbs]$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Fri Jun 26 09:06:05 2020

Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle. All rights reserved.

Connected to:

Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production

Version 19.3.0.0.0

SQL> create pfile='$ORACLE_HOME/dbs/initcms.ora' from spfile ** change as approp.

2 /

File created.

Shutdown immediate and change init file removing the __unified pga line


Copy init’dbname’.ora to 12c home/dbs/init’dbname’.ora - 12c will need this to startup

e.g.

SQL> shutdown immediate;

Database closed.

Database dismounted.

ORACLE instance shut down.

cp $ORACLE_HOME/dbs/initdb1.ora /u01/app/oracle/product/12.2.0/dbhome_1/dbs/.

Then

SQL> startup downgrade;

ORACLE instance started.

Total System Global Area 2147481656 bytes

Fixed Size 8898616 bytes

Variable Size 553648128 bytes

Database Buffers 1577058304 bytes

Redo Buffers 7876608 bytes

Database mounted.

Database opened.


SQL> set termout on serverout on echo on timing on

SQL> spool /home/oracle/downgrade_db1.log

Run the grant admin and the clean audit or you will see error when running the


ERROR at line 1:

ORA-20001: Downgrade cannot proceed - Unified Audit Trail data exists.Please

clean up the data first using DBMS_AUDIT_MGMT.CLEAN_AUDIT_TRAIL.

ORA-06512: at line 75


SQL> grant ADMINISTER DATABASE TRIGGER to SYSTEM;

Grant succeeded.

SQL> exec DBMS_AUDIT_MGMT.CLEAN_AUDIT_TRAIL (audit_trail_type => DBMS_AUDIT_MGMT.AUDIT_TRAIL_UNIFIED, use_last_arch_timestamp => FALSE);

PL/SQL procedure successfully completed.

SQL> @?/rdbms/admin/catdwgrd.sql

Approx 5 mins…

…….

SQL>

SQL> Rem ***********************************************************************

SQL> Rem END catdwgrd.sql

SQL> Rem ***********************************************************************

SQL>

SQL>shutdown immediate;

SQL> exit


When migrating to 19c the timezone had been upgraded so had to do this to downgrade or hit issues.

export ORA_TZFILE=/u01/app/oracle/product/12.2.0/dbhome_1/oracore/zoneinfo/timezone_26.dat

set environment to Oracle 12 and change sid to the correct database i.e. db1

oracle@ora02 dbs]$ export

ORA_TZFILE=/u01/app/oracle/product/12.2.0/dbhome_1/oracore/zoneinfo/timezone_26.dat

[oracle@usze2qmbtora02 dbs]$ sqlplus / as sysdba

exit

SQL*Plus: Release 12.2.0.1.0 Production on Fri Jun 26 09:51:18 2020

Copyright (c) 1982, 2016, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup upgrade;

Remember you need to use the 19c init.file copied earlier with the __unified parameter removed or you will hit

ORA-01078: failure in processing system parameters

LRM-00109: could not open parameter file '/u01/app/oracle/product/12.2.0/dbhome_1/dbs/initdb1.ora'

SQL> startup upgrade pfile='/u99/media/init_db1.ora'

LRM-00101: unknown parameter name '__unified_pga_pool_size'

ORA-01078: failure in processing system parameters

SQL> startup upgrade -> need init file copied earlier

ORACLE instance started.

Total System Global Area 2147483648 bytes

Fixed Size 8622776 bytes

Variable Size 654314824 bytes

Database Buffers 1476395008 bytes

Redo Buffers 8151040 bytes

Database mounted.

Database opened.

SQL>

SQL> set termout on echo on timing on

SQL>@?/rdbms/admin/catrelod.sql

Check that this correctly

@?/rdbms/admin/utlrp.sql


SELECT comp_name, status, substr(version,1,10) as version

from dba_server_registry order by modified


COMP_NAME STATUS
----------------------------------- ----------------------------------------

VERSION
----------------------------------------

Oracle Real Application Clusters OPTION OFF

12.2.0.1.0

Oracle Database Catalog Views VALID

12.2.0.1.0

Oracle Database Packages and Types VALID

12.2.0.1.0

Oracle XML Database VALID

12.2.0.1.0

Oracle Workspace Manager VALID

12.2.0.1.0

JServer JAVA Virtual Machine VALID

12.2.0.1.0

Oracle Database Java Packages VALID

12.2.0.1.0

Oracle Text VALID

12.2.0.1.0

Oracle Multimedia VALID

12.2.0.1.0

Oracle XDK VALID

12.2.0.1.0

OLAP Analytic Workspace VALID

12.2.0.1.0

Spatial VALID

12.2.0.1.0

Oracle OLAP API VALID

12.2.0.1.0

Oracle Label Security VALID

12.2.0.1.0

Oracle Database Vault VALID

12.2.0.1.0


15 rows selected.


Shutdown - (change oratab to v12)

Set environment to grid 19c

. oraenv

+ASM

Remove database from srvctl


[oracle@usze2qmbtora02 ~]$ srvctl remove database -d ora01 

This will remove it from the /etc/oratab so manually put it back as 12

Now we need add the database back as 12c and start it as 12c

Still using the grid 19c environment (19c grid to 12 for another day!)

[oracle@usze2qmbtora02 ~]$ /u01/app/oracle/product/12.2.0/dbhome_1/bin/srvctl add database -d oradb1 -o /u01/app/oracle/product/12.2.0/dbhome_1

[oracle@usze2qmbtora02 ~]$ /u01/app/oracle/product/12.2.0/dbhome_1/bin/srvctl start database -db oradb1 

Run “crsctl stat res -t” and check the db is up and running – shows as 12c but under a 19c grid control - bit odd but hey ho. 

After some research found this :

OGG-01416 OGG-01668 Process Abending


After a test box filesystem had filled up.

Cleared down and checked Golden Gate for any issues.

One of the extracts had abended.

EXTRACT     ABENDED     OAM_X       00:00:00      00:30:03

Checked ggserr.log file :

020-07-06T10:26:37.295-0400 ERROR OGG-01416 Oracle GoldenGate Capture for Oracle, oam_x.prm: File ./dirdat/oam/et000003576, with format RELEASE 9.0/9.5, does not match current format specification of RELEASE 12.3. Modify the parameter file to specify format RELEASE 9.0/9.5 or issue ETROLLOVER prior to restart.
2020-07-06T10:26:37.360-0400 ERROR OGG-01668 Oracle GoldenGate Capture for Oracle, oam_x.prm: PROCESS ABENDING.


Quick search on the web and followed the instructions here :

OGG-01411 – Cannot convert input file ./dirdat/xx with format RELEASE 9.0/9.5 to output file ./dirdat/zz


[oracle@usze2qmbtora02 gghome_app]$ ./ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 12.3.0.1.2 OGGCORE_12.3.0.1.0_PLATFORMS_171208.0005_FBO
Linux, x64, 64bit (optimized), Oracle 12c on Dec 9 2017 00:51:03
Operating system character set identified as UTF-8.

Copyright (C) 1995, 2017, Oracle and/or its affiliates. All rights reserved.

GGSCI (usze2qmbtora02) 1> info OAM_X

EXTRACT OAM_X Last Started 2020-07-06 10:51 Status ABENDED
Checkpoint Lag 00:00:00 (updated 00:43:51 ago)
Log Read Checkpoint Oracle Redo Logs
2020-07-06 10:13:28 Seqno 1909, RBA 236535824
SCN 0.209622245 (209622245)

GGSCI (usze2qmbtora02) 2> stop OAM_X
EXTRACT OAM_X is already stopped.

GGSCI (usze2qmbtora02) 3> alter OAM_X etrollover

2020-07-06 10:58:28 INFO OGG-01520 Rollover performed. For each affected output trail of Version 10 or higher format, after starting the source extract, issue ALTER EXTSEQNO for that trail's reader (either pump EXTRACT or REPLICAT) to move the reader's scan to the new trail file; it will not happen automatically.
EXTRACT altered.

GGSCI (usze2qmbtora02) 4> alter OAM_X begin 2020-07-06 10:13:28
EXTRACT altered.


GGSCI (usze2qmbtora02) 5> start OAM_X

Sending START request to MANAGER ...
EXTRACT OAM_X starting

GGSCI (usze2qmbtora02) 6> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING OAM_X 00:45:17 00:00:14

GGSCI (usze2qmbtora02) 7> info all

Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING OAM_X 00:00:01 00:00:00

Strange why only the one extract failed - will investigate further.



ACL resource



Colleague had an issue with ASM not starting and I think this was due to groups / users being changed during install. Asked to check this and came across this :

[grid@ORA01 ~]$ crsctl stat res ora.asm -f | grep -i acl

ACL=owner:grid:rwx,pgrp:asmdba:r-x,other::r--

should have been (comparing with a working server)

ACL=owner:grid:rwx,pgrp:oinstall:r-x,other::r--

After a lot of digging the command to change this is (why it is necessary to have something unsupported) 

[root@uszw2q2c2ods01 lib]# crsctl setperm resource ora.asm -g 'oinstall' -unsupported

[root@uszw2q2c2ods01 lib]# crsctl stat res ora.asm -f | grep -i acl

ACL=owner:grid:rwx,pgrp:oinstall:r-x,other::r--

DESCRIPTION=Oracle ASM resource

Still have an issue with ASM but made a note of this for future reference

Patching - April 2020


Please test on non-production env.

1.      Download the latest RU


This correct RU can be found by using Oracle Doc ID 2118136.2

Example used in this document is Apr 2020. The GI APR 2020 Release Update 12.2.0.1.200414 includes updates for both the Clusterware home and Database home that can be applied in a standby first method on Data Guard.

Install the Release Update for Apr 2020

The GI APR 2020 Release Update 12.2.0.1.190716 includes updates for both the Clusterware home and Database home that can be applied in a rolling fashion.

This information is found in Oracle Doc ID 2118136.2

3       Download and unzip the patch in a temp directory e.g.

sudo su – grid

# cd /u99/media/apr20/30920127

Check that the directory is empty

# ls

Unzip the patch as the grid home owner (i.e. grid)

unzip p30920127_122010_Linux-x86-64.zip

4        Perform pre-requisite checks for dbhome and grid home

Check the version of opatch installed, version must be 12.2.0.1.19 or later

. oraenv

Enter database instance e.g. oradb01

·        As oracle user enter

$ORACLE_HOME/OPatch/opatch version

export GI_UPDATE=/u99/media/apr20


Validate the Oracle inventory

$ORACLE_HOME/OPatch/opatch lsinventory -detail -oh $ORACLE_HOME

If this command works save the output so you have a status prior to patch apply.

Run the Opatch conflict check FROM README.
For Database home, as home user:

$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir $GI_UPDATE/30920127/30886680

$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir $GI_UPDATE/30920127/30882603


Check if enough free space is available on the db home filesystem for the patches to be applied as given below:

Create file /tmp/patch_list_dbhome.txt with the following content FROM README :

                   vi /tmp/patch_list_dbhome.txt
                        /u99/media/apr20/30920127/30886680
                         /u99/media/apr20/30920127/30882603
                       
$ORACLE_HOME/OPatch/opatch prereq CheckSystemSpace -phBaseFile 
/tmp/patch_list_dbhome.txt
 Stop the OEM agent (also other apps, Golden Gate etc - outside scope of this doc)

·        As the oracle user

agentbin

               ./emctl stop agent

               exit

·        sudo su – grid

. oraenv

Enter the ASM instance e.g. +ASM

export GI_UPDATE=/u99/media/software/database/apr20qtrupdate

/u01/app/12.2.0/gridhome/OPatch/opatch version

Validate the Oracle inventory

/u01/app/12.2.0/gridhome/OPatch/opatch lsinventory -detail -oh /u01/app/12.2.0/gridhome

If this command works save the output so you have a status prior to patch apply.

Run the Opatch conflict check

For Grid Infrastructure Home, as home user FROM README:

$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir $GI_UPDATE/30920127/30886680

$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir $GI_UPDATE/30920127/30882603

$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir $GI_UPDATE/30920127/30869447

$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir $GI_UPDATE/30920127/26839277

$ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir $GI_UPDATE/30920127/30888810

Check if enough free space is available on the grid home filesystem for the patches to be applied as given below:

Create file /tmp/patch_list_gihome.txt with the following content:

vi /tmp/patch_list_gihome.txt FROM README
 
                    /u99/media/software/database/apr20/30920127/30886680
/u99/media/software/database/apr20/30920127/30882603
/u99/media/software/database/apr20/30920127/30869447
/u99/media/software/database/apr20/30920127/26839277
/u99/media/software/database/apr20/30920127/30888810

Run the opatch command to check if enough free space is available in the Grid Infrastructure Home:

               $ORACLE_HOME/OPatch/opatch prereq CheckSystemSpace -phBaseFile /tmp/patch_list_gihome.txt
 
If OPatch reports Prereq "checkSystemSpace" passed., then no action is needed. 
Proceed with patch installation.



If the version of opatch is not satisfactory download the relevant version of Opatch to a temporary directory, for each database home and the grid home that are being patched, run the following commands as the home owner to extract the Opatch utility

5       Run the following commands as the home owner to extract the opatch utility

unzip <OPATCH-ZIP. -d /u01/app/oracle/product/12.2.0/dbhome

/u01/app/oracle/product/12.2.0/dbhome /Opatch/opatch version


6       Verify the patch


This must be run as the root user

·        sudo su –

. oraenv

Select the grid instance i.e. +ASM

export GRID_HOME=$ORACLE_HOME

export PATH=$PATH:$GRID_HOME

export GI_UPDATE=/u99/media/apr20/30920127

$GRID_HOME/OPatch/opatchauto apply $GI_UPDATE -analyze 

exit

    
7        Apply the Apr 2020 Release Update

The Opatch utility has automated the patch application for the Oracle Grid Infrastructure (GI) home and the Oracle RAC database homes. It operates by querying existing configurations and automating the steps required for patching each Oracle RAC database home of same version and the GI home.

The utility must be executed by an operating system (OS) user with root privileges, and it must be executed on each node in the cluster if the GI home or Oracle RAC database home is in non-shared storage. The utility should not be run in parallel on the cluster nodes.

Depending on command line options specified, one invocation of opatchauto can patch the GI home, Oracle RAC database homes, or both GI and Oracle RAC database homes of the same Oracle release version as the patch. You can also roll back the patch with the same selectivity.

In these instructions we are patching the DB home and grid home together

7.4   Add the directory containing Opatch to the Path environment variable


 export PATH=$PATH:$GRID_HOME/OPatch


7.5   Apply the patch


 opatchauto apply /u99/media/software/database/apr20/30920127


7.6   Run datapatch only on the primary database to update the databases using the db home that has been patched, these changes will then be replicated to the standby servers


Run as the database home user

cd .oraenv

Select the database instance i.e. dbora1


cd $ORACLE_HOME/OPatch

./datapatch -verbose



7.7   To rollback the patch from GI Home

# opatchauto rollback /u99/media/software/database/30920127 -oh <GI_HOME>


7.8   To rollback the patch from database home

# opatchauto rollback /u99/media/software/database/30920127 -oh <ORACLE_HOME>


7.9   Patch post installation instructions

Run the following script to check that the updates have been applied

                      Set lines 200

Col description format a50

                      Col action_time format a30

                      select patch_id, status, action_time, description from dba_registry_sqlpatch

       
7.10                   Start the Enterprise Manager agent       


agentbin

./emctl start agent

Deinstall of Oracle Grid - Metalink 1570358.1

Following a failed upgrade on grid from 12.1 to 12.2 (test box)

Ran the “Upgrade Oracle Grid Infrastructure” from gridsetup.sh which ran successfully; however when running the root.sh script we hit :

The log of current session can be found at:
/u01/app/crsdata/odw01/crsconfig/roothas_2020-04-17_04-23-08AM.log
PROTL-4: Failed to retrieve data from the local registry
2020/04/17 04:23:26 CLSRSC-169: Failed to create or upgrade OLR
Died at /u01/app/product/12.2.0/gridhome_1/crs/install/oraolr.pm line 495.
The command 
'/u01/app/product/12.2.0/gridhome_1/perl/bin/perl -I/u01/app/product/12.2.0/gridhome_1/perl/lib -I/u01/app/product/12.2.0/gridhome_1/crs/install /u01/app/product/12.2.0/gridhome_1/crs/install/roothas.pl ' execution failed

Checked for possible fixes but did not want to run any patches or unsupported fixes without going through Oracle (did not want to cause further probs).

We ran the deinstall tool for the Grid Home 12.2 (which by that point had updated the inventory) which ran successfully. This was to go back to 12.1

However, when we started everything back up the “crsctl stat res -t” was reporting missing “targets” specifically missing ASM diskgroups – very strange. 

When we checked the ASM parameters in the ASM instance, some were empty. 

SQL> show parameter asm

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
asm_diskgroups string DATA
asm_diskstring string /dev/oracleasm/disks
asm_power_limit integer 1
asm_preferred_read_failure_groups string 


The deinstall seems to be more than just remove the s/w gird directory.
More on Metalink here :

How to Reconfigure Oracle Restart on 12c / 12.1 (Doc ID 1570358.1)

Followed this document – back with everything working 😊

Handy Metalink doc to know about if you ever need to remove Oracle grid - hence the note here.



Stopping a node in Oracle RAC - 12c


In an emergency situation where you have very little time to shut down a RAC cluster.
“crsctl stop has” would bring the node down but this does a shutdown abort and can cause issues – especially on a production environment, although oracle does not give any warnings, so best avoided.

So for each node on RAC that needs to be closed - stop Oracle applications like Enterprise Manager and Golden Gate (outside scope of this doc). Note on a multi node RAC env, for example, there maybe GG processes on two nodes but not on the other nodes.

Please check for acfs file systems - outside of scope of this doc.

+++++++++++++++++++++

Run crsctl stat res -t  – copy output, make sure at the end the same command outputs the same.

You will need to set the environment depending on the step.

After each step check status.

++++++++++++++++++++

emctl stop agent

++++++++++++++++++++

Oracle user – RDBMS env.

srvctl stop listener -n node1

++++++++++++++++++++

The instance shutdown depends on how the instance is managed – we are automatic.


[oracle@server01 ~]$ srvctl config database -db oradb -all

Database unique name: abp

…..

Management policy: AUTOMATIC


So – we are using policy managed so :

$ srvctl stop instance -db oradb -node node1 (replace db and node name as approp.)

Stop ALL instances on node you want to stop.

++++++++++++++++++

ASM env. i.e. Grid

When instances are down you can then stop ASM

srvctl stop asm -n node1

++++++++++++++++++

srvctl stop nodeapps -node1 -f

++++++++++++++++++

In this command all CRS related process will be stopped. This is the only command which needs to be executed by "root" user on all database nodes.

crsctl stop crs

+++++++++++++++++

Check cluster down :

crsctl stat res -t

+++++++++++++++++

Startup – this in reverse but using start.

++++++++++++++++++++

More here :




**NB**

Disable autostart of CRS in order to be able to perform server-maintenance and CRS is NOT to be restarted after a reboot of the server (as user root)

Root] # . oraenv

ORACLE_SID = [+ASM] ? +ASM1
The Oracle base has been set to /u01/app/oracle

# crsctl disable crs
CRS-4621: Oracle High Availability Services autostart is disabled.


Further notes  --

Create blackout on OEM and stop relevant apps – outside scope of doc.

Run a ‘ps -ef | grep smon’ to check what instances are running.

Take output from ‘crsctl stat res -t’

Set environment to appropriate oracle home and begin shutdown of node.

[oracle@server01 ~]$ srvctl stop listener -n server01

[oracle@server01 ~]$ srvctl stop service -db oradb -node server01

[oracle@server01 ~]$ srvctl stop instance -db oradb -node server01

[oracle@server01 ~]$ srvctl stop service -db ods -node server01

[oracle@server01 ~]$ srvctl stop instance -db ods -node server01

[oracle@server01 ~]$ srvctl stop asm -n server01 -f

[oracle@server01 ~]$ srvctl stop listener -listener ASMNET1LSNR_ASM -node server01

[oracle@server01 ~]$ srvctl stop nodeapps -node server01 -f

PRCR-1014 : Failed to stop resource ora.net1.network

PRCR-1065 : Failed to stop resource ora.net1.network

CRS-2670: Unable to start/relocate 'ora.net1.network' because 'ora.qosmserver' has a stop-time 'hard' dependency on it

CRS-0245:  User doesn't have enough privilege to perform the operation

--will run further investigation on this.

Logout

So up the priv to root for this node – need to be root to stop crs anyway.

[root@server01 ~]# . oraenv

ORACLE_SID = [root] ? +ASM1

The Oracle base has been set to /u01/app/grid

[root@server01 ~]# srvctl stop nodeapps -node server01 -f

PRCC-1017 : ons was already stopped on server01

PRCR-1005 : Resource ora.ons is already stopped

[root@server01 ~]# crsctl stat res -t | more

--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
               ONLINE  ONLINE       usnympmtaora03b          STABLE
               OFFLINE OFFLINE      server01          STABLE
……etc

ora.oradb.ec.svc
      1        ONLINE  ONLINE       usnympmtaora03b          STABLE
      3        OFFLINE OFFLINE                               STABLE
      4        ONLINE  ONLINE       usnyspmtaora02b          STABLE
ora.oradb.frm.svc

      1        ONLINE  ONLINE       usnyspmtaora02b          STABLE


[root@server01 ~]# crsctl stop crs

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'server01'
CRS-2673: Attempting to stop 'ora.crsd' on 'server01'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on server 'server01'
CRS-2673: Attempting to stop 'ora.chad' on 'server01'
CRS-2673: Attempting to stop 'ora.ods_gg_data.odsggdata.acfs' on 'server01'
CRS-2673: Attempting to stop 'ora.umb_gg_data.umbggdata.acfs' on 'server01'
CRS-2677: Stop of 'ora.ods_gg_data.odsggdata.acfs' on 'server01' succeeded
CRS-2673: Attempting to stop 'ora.ODS_GG_DATA.ODSGGDATA.advm' on 'server01'
CRS-2677: Stop of 'ora.ODS_GG_DATA.ODSGGDATA.advm' on 'server01' succeeded
CRS-2677: Stop of 'ora.umb_gg_data.umbggdata.acfs' on 'server01' succeeded
CRS-2673: Attempting to stop 'ora.UMB_GG_DATA.GGDATA.advm' on 'server01'
CRS-2677: Stop of 'ora.UMB_GG_DATA.GGDATA.advm' on 'server01' succeeded
CRS-2673: Attempting to stop 'ora.proxy_advm' on 'server01'
CRS-2677: Stop of 'ora.chad' on 'server01' succeeded
CRS-2677: Stop of 'ora.proxy_advm' on 'server01' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'server01' has completed
CRS-2677: Stop of 'ora.crsd' on 'server01' succeeded
CRS-2673: Attempting to stop 'ora.storage' on 'server01'
CRS-2673: Attempting to stop 'ora.crf' on 'server01'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'server01'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'server01'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'server01'
CRS-2677: Stop of 'ora.drivers.acfs' on 'server01' succeeded
CRS-2677: Stop of 'ora.crf' on 'server01' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'server01' succeeded
CRS-2677: Stop of 'ora.storage' on 'server01' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'server01'
CRS-2677: Stop of 'ora.asm' on 'server01' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'server01'
CRS-2677: Stop of 'ora.mdnsd' on 'server01' succeeded
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'server01' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'server01'
CRS-2673: Attempting to stop 'ora.evmd' on 'server01'
CRS-2677: Stop of 'ora.ctssd' on 'server01' succeeded
CRS-2677: Stop of 'ora.evmd' on 'server01' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'server01'
CRS-2677: Stop of 'ora.cssd' on 'server01' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'server01'
CRS-2677: Stop of 'ora.gipcd' on 'server01' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'server01' has completed

CRS-4133: Oracle High Availability Services has been stopped.

[root@server01 ~]# crsctl stat res -t

CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

Down as expected.

STARTUP…

Some of the startup is ran when we start nodeapps but for completeness.

[root@server01 ~]# crsctl start crs

. oraenv to one of the databases…

[oracle@server01 ~]$ srvctl start nodeapps -node server01

[oracle@server01 ~]$ crsctl stat res -t

[oracle@server01 ~]$ srvctl start asm -n server01

[oracle@server01 ~]$ srvctl start listener -listener ASMNET1LSNR_ASM -node server01

[oracle@server01 ~]$ srvctl start instance -db oradb -node server01

[oracle@server01 ~]$ srvctl status database -db oradb -v

[oracle@server01 ~]$ srvctl start instance -db ods -node server01

[oracle@server01 ~]$ srvctl status database -db ods -v

[oracle@server01 ~]$ srvctl start listener -n server01

[oracle@server01 ~]$ emctl status agent

[oracle@server01 ~]$ emctl start agent

[oracle@server01 ~]$ crsctl stat res -t (check everything running)

[oracle@server01 ~]$ crsctl stat res -t | grep 04x | wc -l (compare with start)