Last week I posted about the importance of keeping the tooling up to date in your ExaDB-C@C; you can check the post here: ExaDB-C@C – Dataguard creation failed due to DBCS RPM mismatch version, what? After fixing all those issues, I tried to create the Data Guard (DG) again, but it failed once more. What was the issue this time?
Error
After executing the workflow again from the console to enable Data Guard (enabling it from the console means creating it), the operation failed with the following message:
+-----------+------------------------------------------------------------------------------------------------------------------------------------------+
| EXCEPTION | DETAILS                                                                                                                                  |
+-----------+------------------------------------------------------------------------------------------------------------------------------------------+
| CDG-50638 | Standby Environment already has the CREG resource name being used                                                                       |
|           | Check errors reported in logs                                                                                                            |
| dg_api    | CDG-50107 : DataGuard prechecks failed for stage VERIFY_DG_STANDBY                                                                       |
|           | Refer the exceptions raised and fix the issues                                                                                           |
|           | File: dg_api, Line#: 1749, Log: /var/opt/oracle/log/XXXX/dbaasapi/db/dg/dbaasapi_VERIFY_DG_STANDBY_2024-01-26_10:05:40.869076_166150.log |
+-----------+------------------------------------------------------------------------------------------------------------------------------------------+
And this happened because the previous execution had failed due to the DBCS RPM version mismatch.
The automation workflow is supposed to do some post-failure work, including some cleanup tasks, but sometimes things do not work as we expect, or as the programmers expect either. You know, sh..t happens!
In these cases, the first thing you should do is open a case in My Oracle Support.
Anyway, after doing some research, I did a couple of things to solve the issue myself, but again… please open a case with Oracle Support.
Monitoring
So, where did I get all the previous information?
When you execute a DG workflow through the OCI console or the API, the logs and state are written to /var/opt/oracle/cstate/*, and you can find more information in /var/opt/oracle/log/<DBNAME>/dbaasapi/* as well.
In /var/opt/oracle/cstate/* you will find some XML files with a naming convention like this:
- cstate_<UNIQUE_ID…>.prog.xml
  - prog means the task is in progress
- cstate_<UNIQUE_ID…>.suc.xml
  - suc means the task has completed successfully
- cstate_<UNIQUE_ID…>.fail.xml
  - fail means the task has failed
These files are generated on both the primary and the standby servers. If you open them, you will find valuable information about what the workflow is doing and, most importantly, the location of the log file.
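For example, a quick way to spot the failed task and jump to its log could look like this (a minimal sketch, assuming root access on the DB server; the grep is only an illustration, as the exact XML contents may vary between tooling versions):

# List the task state files, newest first
ls -lt /var/opt/oracle/cstate/cstate_*.xml

# Show only the failed tasks
ls /var/opt/oracle/cstate/ | grep '\.fail\.xml$'

# Pull the log file references out of the failed task files
grep -i 'log' /var/opt/oracle/cstate/cstate_*.fail.xml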
Fixing the issue
So, this was the exception: CDG-50638 | Standby Environment already has the CREG resource name being used
What is this “CREG” resource?
It is just a file! Its name is <dbname>.ini and you can find it in /var/opt/oracle/creg/, but there are other things you should be aware of.
On primary
To perform the cleanup manually, I did the following in the primary cluster using the Data Guard Deployer (DGdeployer):
/var/opt/oracle/ocde/assistants/dg/dgcc -dbname <DBNAME> -action delete
I got the following messages:
Warning: No default value specified for parameter -starterdb
Warning: No default value specified for parameter -cpParams
Invoke DeconfigDG
Use of uninitialized value in string eq at /var/opt/oracle/ocde/assistants/dg/dgcc.pm line 1306.
Successfully deconfigured DG
Then I made a backup of the CREG file <dbname>.ini in /var/opt/oracle/creg/ and removed all the dg_* entries except dg_config, which I left as it was. After removing the entries, you should get output similar to this:
[root@exaccprimnode1 creg]# grep -i "dg" <DBNAME>.ini
bkup_cfg_dgobs_spec=dgobscfg.spec
dg_config=yes
You need to modify the CREG file on all the nodes that are part of the primary cluster.
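As an illustration, the edit could look like the sketch below (assuming GNU sed; it keeps dg_config and leaves bkup_cfg_dgobs_spec alone, since that entry does not start with dg_, and you should verify the result before moving on):

# Back up the CREG file first
cp /var/opt/oracle/creg/<DBNAME>.ini /var/opt/oracle/creg/<DBNAME>.ini.bkp

# Delete every dg_* entry except dg_config
sed -i '/^dg_/{/^dg_config=/!d}' /var/opt/oracle/creg/<DBNAME>.ini

# Verify what is left
grep -i "dg" /var/opt/oracle/creg/<DBNAME>.ini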
On standby
On all the standby servers, I made a backup of the CREG file <dbname>.ini from /var/opt/oracle/creg and then removed the original file.
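In shell terms, something along these lines (a sketch only; the /tmp backup location is just my choice, keep the copy wherever you prefer):

# Run on every standby node: back up the CREG file, then remove the original
cp /var/opt/oracle/creg/<DBNAME>.ini /tmp/<DBNAME>.ini.bkp
rm /var/opt/oracle/creg/<DBNAME>.ini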
After that, I ran the following command using the Oracle Cloud Deployment Engine (OCDE):
/var/opt/oracle/ocde/ocde -deldb -exa -dbname <DBNAME>
This also cleans up the DG configuration on the standby server.
Finally, after completing all those steps, I executed the workflow to create the DG again, and it worked without any further issues… Yay!
Conclusion
Creating a Data Guard (DG) configuration should be straightforward, right? But sometimes it doesn't work, and finding the root cause can be a challenge because, unfortunately, the OCI console currently does not give enough information about the issue in most cases.
In this post you have seen how to monitor the state of the workflow tasks, identify which task failed, and locate the log files, so you can try to understand the reason for a failure. That said, in scenarios like these, reaching out to Oracle Support is essential: a support case can provide valuable insights, guidance, and potential solutions for tackling similar issues in the future.