Thursday, May 4, 2017

Oracle: How to reactive-ly monitoring db_link timeout.

I thought this is an interesting question that brought up to me on how to reactive-ly responding to when there are packet loss and impacting the main server where application unable to retrieve data through synonym that pointing to db link. Since db link isn't a process and packet loss is outside of the database control and the server hosting the database instance, there isn't really anything can be done about it within the database. There isn't anything to be restarted or retried or anything built in inside Oracle OEM to monitor and reactive-ly responding to packet loss. 


Perhaps install simple script to check remote db_link is alive in the crontab.


#!/bin/ksh
chk_table=`sqlplus -s vc50/password1@orcl<<oem
set heading off;
select count(1) from dual@db_link_name;
oem
`
if [ $chk_table != "1" ]
then
echo $chk_table
echo -e 'Remote server might be down '`uname -n`'' | mail -v -s 'Package loss is happening' admin@company.com
fi



Another option would probably increase the sqlnet.inbound_connect_timeout parameter within sqlnet.ora but that is not exactly a way to monitor. Any other ways of reactive-ly monitoring db link dis-connectivity ?  If you do, please leave me a note. Thanks in advances.

Wednesday, April 26, 2017

MSSQL: CHECKDB found 0 allocation errors and X consistency error



Found corruption error while running DBCC CHECKDB on the SQL Server DB.

Error messages

Msg 8990, Sev 16, State 1, Line 2 : CHECKDB found 0 allocation errors and 10 consistency errors in table 'VPX_EVENT' (object ID 138483572). [SQLSTATE 01000]
Msg 8989, Sev 16, State 1, Line 2 : CHECKDB found 0 allocation errors and 10 consistency errors in database 'VCDB'. [SQLSTATE 01000]
Msg 8958, Sev 16, State 1, Line 2 : repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (VCDB). [SQLSTATE 01000]

This is data loss corruption similar to Oracle’s datafile header corruption. Damage already been done. User can use repair_allow_data_loss feature as the SQL Server error suggested or restore from good backup. If backup available, I would prefer restoring the db or the specific table. The good news is, the VPX_EVENT isn’t a crucial table. It stores events related information.

Here is what can be done on the repair route. Another option is to clone the existing VCDB to a different MSSQL database and test out the procedures. If everything works smoothly (verified with no data loss), perform the same on the production side.


--If user decided to do it in existing db then take backup of the VCDB first
use VCDB
--verify the corruption still exist
DBCC CHECKDB
--this can only be done in single user mode. So, switch it to single user.
ALTER DATABASE VCDB SET SINGLE_USER;
--attempt with REPAIR REBUILD first where it will attempt to repair without data loss.
DBCC CHECKDB ('VCDB', REPAIR_REBUILD);
--To verify if corruption still exist.
DBCC CHECKDB;
--if it does, move on to repair allow data loss feature, this will result of masking out the corruption where there will unrepairable data loss.
DBCC CHECKDB ('VCDB', REPAIR_ALLOW_DATA_LOSS);

Once completed, set the db back to multi users, otherwise no one else can access the database.

ALTER DATABASE VCDB SET MULTI_USER;


DBA should find out what causes this corruption. This typically results of disk issues.


Tuesday, March 14, 2017

VC6.5: DB lock has been detected on the database by the user associated with the provided DSN.



This is caused by either the vCenter schema is not completely created or the schema session is locked by something else. If you know that the possible locked session is all you need to get rid of then do the following query.
Or
Bouncing the database instance took care of it.



SYS> select 'alter system kill session ''' || sid || ',' || serial# || ''';' from gv$session where username ='VPX';
'ALTERSYSTEMKILLSESSION'''||SID||','||SERIAL#||''';'
--------------------------------------------------------------------------------------------------------------
alter system kill session '240,4292';

SYS> alter system kill session '240,4292';
System altered.