-- * The database administrator’s duties are to:
--   * Protect the database from failure wherever possible
--   * Protect critical components by redundancy
--   * Minimize the loss of data
--   * Increase the mean time between failures (MTBF)
--   * Decrease the mean time to recover (MTTR)


-- * Categories of Failure
--------------------------

--   * Statement failure
------------------------
--   * there are four common causes of statement failure: 
--     1. invalid data - usually a format or integrity constraint violation, 
--        Oracle itself will try to solve formatting problems by doing automatic 
--        type casting to convert data types on the fly, but this is not very efficient and 
--        shouldn’t be relied upon. Constraint violations will be detected, but Oracle can do 
--        nothing to solve them.
--     2. insufficient privileges
--     3. space allocation problems - Space management problems are frequent, 
--        but they should never occur. A good DBA will monitor space usage proactively and take 
--        action before problems arise. Space related causes of statement failure include 
--        * inability to extend a segment because the tablespace is full
--        * running out of undo space
--        * insufficient temporary space when running queries that use disk sorts 
--        * a user hitting his quota limit
--        * or an object hitting its maximum extents limit. 
--        Database Control includes 
--        * the undo advisor
--        * the segment advisor, 
--        * the Automatic Database Diagnostic Monitor 
--        * the alert mechanism
--        which will help to pick up space-related problems before they happen.
--      4.logic errors - Programmers may well develop code that in some circumstances is
--        impossible for the database to execute, for example if the application 
--        attempts to insert a child row before the parent row.

--   * User process failure 
---------------------------
--     A user process may fail for any number of reasons for example: 
--     * the user exiting abnormally instead of logging out 
--     * the terminal rebooting 
--     Whatever the cause of the problem, the outcome is the same : 
--     The PMON background process periodically polls all the server processes to ascertain
--     the state of the session. 
--     If a server process reports that it has lost contact with its user process, 
--     PMON will tidy up. 
--     If the session was in the middle of a transaction, PMON will roll back the 
--     transaction and release any locks. Then it will terminate the server
--     process and release the PGA back to the operating system.
--   * This type of problem is beyond the DBA’s control, but s/he should watch for any
--     trends that might indicate a lack of user training, poorly written software, or perhaps
--     network or hardware problems.

--   * Network failure
----------------------
--     A database listener is unlikely to crash, but there are limits to the amount of work
--     that one listener can do. 
--     A listener can service only one connect request at once, and it does take 
--     an appreciable amount of time to launch a server process and connect it
--     to a user process. 
--     If your database experiences high volumes of concurrent connection requests, users may 
--     receive errors when they try to connect. You can avoid this by configuring 
--     multiple listeners (each on a different address/port combination) and using 
--     connect-time load balancing

--   * User error 
-----------------
--     User: “I forgot to put a WHERE clause on my UPDATE statement, so I’ve just
--            updated a million rows instead of one.”
--     DBA: “Did you say COMMIT?”
--     User: “Oh, yes.”
--     DBA: “Um....”

--   * The ideal solution to user errors is to prevent them from occurring in the first place.
--     This is partly a matter of user training, but more especially of software design: 
--     no user process should ever let a user issue an UPDATE statement without a WHERE clause.
--   * But even the best-designed software cannot prevent users from issuing SQL that is
--     inappropriate to the business. 
--     Oracle provides a number of ways whereby you as DBA may be able to correct user errors,
--     but this is often extremely difficult-particularly if the error isn’t 
--     reported for some time. The possible techniques are 
--     * flashback query 
--     * flashback drop 
--     * the Log Miner 
--     * incomplete recovery 
--     * Flashback Database 

--     * Flashback technology 
--     * Logminer utility 
  
-- DEMO - Flashback query
-------------------------

SELECT last_name, salary
FROM hr.employees;

-- Update and commit (User forgot to put a where condition, and also commited his changes...)
UPDATE hr.employees
SET salary = 0;

commit;

-- Data is changed and cannot be rolled back...
SELECT last_name, salary
FROM hr.employees;


-- Flashback Query - Select from the table as it was at a specific point in the past:
-- In this example - go back one hour:

SELECT last_name, salary
FROM hr.employees AS OF TIMESTAMP sysdate - 1/24;

-- Fix the table with a correlated Update:
UPDATE hr.employees e
SET salary = (SELECT salary from hr.employees AS OF TIMESTAMP sysdate - 1/24
              WHERE employee_id = e.employee_id);


-- Logminer
-----------

-- * All changes made to user data or to the database dictionary are recorded in 
--   the Oracle redo log files so that database recovery operations can be performed.
-- * LogMiner provides a well-defined, easy-to-use, and comprehensive relational interface to redo --   log files. 
--   * it can be used as a powerful data audit tool  
--   * as well as a tool for sophisticated data analysis
--   * it could also be used for undo 

desc v$logmnr_contents

-- * Before Oracle 11g it was accessed using various Oracle supplied packages 
-- * Since 11g it could be easily accessed using the OEM
--   For example : "Home page" => "Schema" => "Tables" => (go) => (actions => log miner)
-- * To extract a LogMiner dictionary to the redo log files, the database must be open and in    
--   ARCHIVELOG mode and archiving must be enabled.



-- * (continue) Categories of Failure
-------------------------------------

--   * Instance Failure
-----------------------
-- * Instance failure occurs when the database instance is shut down before synchronizing all 
--   database files. An instance failure can occur because of hardware or software failure or 
--   through the use of the emergency SHUTDOWN ABORT and STARTUP FORCE shutdown commands.
-- * The Oracle database (SMON) automatically recovers from instance failure. 
--   All that needs to happen is for the instance to be started normally. 


-- * When Oracle Database Performs Instance Recovery
----------------------------------------------------
-- * Whether instance recovery is required depends on the state of the redo threads. 
-- * A redo thread is marked OPEN in the CF when a database instance opens in read/write mode, 
-- * and is marked CLOSED when the instance is shut down consistently. 
-- * If redo threads are marked open in the control file, 
--   but no live instances hold the thread enqueues corresponding to these threads, 
--   then the database requires instance recovery.

-- Oracle Instance Architecture
-------------------------------

-----------------------------
-- * Executing SQL statements
-----------------------------

-- * Executing a SELECT Statement
--   1. The server process executing the statement will first check whether 
--      the blocks containing the data required are already in memory, in the database buffer cache. 
--   2. If they are, then execution can proceed immediately. 
--      If they are not, the server must locate them on disk and copy them into the 
--      database buffer cache.
--   3. Once the data blocks required for the query are in the database buffer cache, 
--      any further processing (such as sorting or aggregation) is carried out in the PGA of the
--      session. 
--   4. When the execution is complete, the result set is returned to the user process.

----------------------------------
-- * Executing an Update Statement
----------------------------------

--   1. The first step in executing DML is the same as executing SELECT: 
--      the required blocks must be found in memory or copied into memory from disk. 
--      The only change is that an empty block of an undo segment is needed too
--   2. Locks must be placed on any rows that will be affected by the operation
--   3. Then the redo is generated: the server process writes to the log buffer the changes
--      that are going to be applied to the data blocks.  
--   4. Having generated the redo, the update is carried out in the database buffer cache:
         A. The block of table data is updated with the new version of the changed column, 
         B. Tnd the old version of the changed column is written to the block of an undo segment. 
--   5. From this point until the update is committed, 
--      all queries from other sessions addressing the changed row will be 
--      redirected to the undo data. 
--      Only the session that is doing the update will see the actual current 
--      version of the row in the table block.

-------------------------------------------
-- * Executing Insert and Delete Statements
-------------------------------------------

-- * Conceptually, INSERT and DELETE are managed in the same fashion as an UPDATE.
--   Redo generation is exactly the same: all changes to be made to data 
--   are first written out to the log buffer. 
-- * The difference is in the amount of undo generated. 
--   INSERT - When a row is inserted, the only undo generated consists of writing out the new rowid
--            to the undo block. This is because to roll back an INSERT, the only information Oracle
--            requires is the rowid, 
--   DELETE - the whole row is written to the undo block, so that the deletion 
--            can be rolled back if need be by inserting the complete row back
--            into the table.

---------------------------------------------------------
-- Buffers => Background Processes => Physical Structures
---------------------------------------------------------

-- * Redo Log Buffer => LGWR => Redo log files 
--   LGWR Writes upon commit and other events
--   Main purpose - guarantee that every transaction with COMMIT is "saved"   

-- * Buffer Cache => DBWR => Data files 
--                   CKPT => checkpoint position in DF and CF 
--   Every time DBWR writes dirty buffers, Check-Point-Position is written 
--   to the control files and data files headers (unless its an Icremental CP) 


-- * Phases of Instance Recovery
---------------------------------
-- 	Startup instance (data files are out of sync)
-- 1.	Roll forward (redo) => Committed and uncommitted data in files	(MTTR)
-- 	Database opened
-- 2.	Roll back (undo)    => Uncommitted data in files

-- Timeline 
---------------> Transaction --> Commit --> Checkpoint ---> Crash 
-- Since Instance Recovery begins after last checkpoint, this transaction 
-- doesnt need to be recovered

---------------> Checkpoint  --> Transaction --> commit --> Crash 
-- SMON will Rollforward this transaction

---------------> Checkpoint  --> Transaction --> Crash
-- SMON will Rollforward and then Rollback this Transaction



-- Redo log files
-----------------

-- * Oracle consists from at least two redo log files
-- * When one RLF is full  => LOG SWITCH => data will be written to the next RLF
--   When last RLF is full => LOG SWITCH => data will overwrite the first RLF
-- * Data is written to the RLF in a circular fashion 
--   What if Chekpoint Position is still pointing at entry in the first RLF for example ? 
-- * In order to overwrite RLF : 
--   * The data in the RLF is not needed for crash-recovery (CHECKPOINT will have to take place).
--   * ARCHIVE MODE (non default mode) - when DB is in archivelog mode the data must be archived. 

-- * That is the reason why in every LOG SWITCH :
--   a. CHECKPOINT will take place 
--   b. RLF will be archived (In ARCHIVE MODE)

-- Redo log files Information : 
set lines 200
SELECT GROUP#, SEQUENCE#, BYTES, MEMBERS, STATUS, FIRST_CHANGE#,
       TO_CHAR(FIRST_TIME,'dd/mm hh24:mi:ss') "FIRST TIME"
FROM v$log
ORDER BY GROUP#

-- SEQUENCE#
------------
-- Every Log switch makes the current Redo Log Sequence = Last Redo Log Sequence + 1 

ALTER SYSTEM SWITCH LOGFILE;

-- Status : 
-----------
-- * CURRENT  - Indicates a valid log that is in use (highest SEQUENCE#)
-- * INACTIVE - Means that the log is no longer needed for instance recovery
-- * ACTIVE   - Means that the log is is still needed for instance recovery

ALTER SYSTEM CHECKPOINT;

-- SCN
------
-- * the system change number (SCN) is Oracle's clock - every time we commit, 
--   the clock increments. The SCN just marks a consistent point in time in the database.
-- * The database ALWAYS has transactions going on, ALWAYS.  
--   SMON and many other background processes are always doing work, the database 
--   (unless it is opened read only) is always doing transactions. 

SELECT checkpoint_change# AS "Checkpoint Position" , 
       current_scn "Last Change"FROM v$database;

-- http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:48052838748707

-- Crash recovery =>  From Checkpoint Position until Last Change

show parameter MTTR 

-- Lower value for the MTTR results higher checkpoint frequency
-- Higher value fro the MTTR results lower checkpoint frequency

SELECT TARGET_MTTR,ESTIMATED_MTTR
FROM v$instance_recovery ;

SELECT RECOVERY_ESTIMATED_IOS,
       ESTIMATED_MTTR,
       TARGET_MTTR,
       WRITES_MTTR
       WRITES_AUTOTUNE
FROM   v$instance_recovery;

-- * RECOVERY_ESTIMATED_IOS The number of read/write operations that would be needed on
--   datafiles for recovery if the instance crashed now

-- * ESTIMATED_MTTR The number of seconds it would take to open the database if it
--   crashed now

-- * TARGET_MTTR The setting of FAST_START_MTTR_TARGET

-- * WRITES_MTTR The number of times DBWn had to write, in addition to the
--   writes it would normally have done, to meet the TARGET_MTTR

-- * WRITES_AUTOTUNE The number of writes DBWn did that were initiated by the
--   auto-tuning mechanism



--------------------------------
-- MEDIA FAILURE
--------------------------------

--------------------------------
-- Control File 
--------------------------------

show parameter control_files 


-- Add another control file: (maximum 8)
----------------------------------------

show parameter control_files 
 
-- 1. Change the parameter:

ALTER SYSTEM SET control_files=
'C:\APP\ADMINISTRATOR\ORADATA\ORCL\CONTROL01.CTL',
'C:\APP\ADMINISTRATOR\FLASH_RECOVERY_AREA\ORCL\CONTROL02.CTL',
'C:\APP\ADMINISTRATOR\ORADATA\ORCL\CONTROL03.CTL' 
scope=spfile;
 
-- 2. shutdown

shutdown immediate 

-- 3. Copy in the OS:

-- 4. startup
 
select name from v$controlfile;

------------------------------------
-- Redo Log File
------------------------------------

-- Add redo log members 
-----------------------

SELECT GROUP#, SEQUENCE#, BYTES, MEMBERS, STATUS, FIRST_CHANGE#,
       TO_CHAR(FIRST_TIME,'dd/mm hh24:mi:ss') "FIRST TIME"
FROM v$log
ORDER BY SEQUENCE# DESC

-- View members :

col member for a40
SELECT GROUP#, MEMBER
FROM v$logfile ;

-- Add one member for each group : 

ALTER DATABASE ADD LOGFILE MEMBER
'C:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO01_2.LOG' TO GROUP 1,
'C:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO02_2.LOG' TO GROUP 2,
'C:\APP\ADMINISTRATOR\ORADATA\ORCL\REDO02_3.LOG' TO GROUP 3;

col member for a40
SELECT GROUP#, MEMBER
FROM v$logfile 
ORDER BY GROUP#;


------------------------------------------------------------------------------------
-- FRA - Flash Recovery Area (since 11gR2 - "F" stands for "fast"...) 
------------------------------------------------------------------------------------
-- * A unified storage location for all backup recoveryrelated files and activities 
--   in an Oracle database.
-- * Simplifies the location of database backups
-- * Automatically manages the disk space allocated for recovery files

show parameter recovery_file
                                                                
-- DB_RECOVERY_FILE_DEST     : Sets the location for the Flash Recovery Area
-- DB_RECOVERY_FILE_DEST_SIZE: Sets the disk limit

--   * If we used up all the space, the DB will be stuck. (Cannot create new archives).
--   * Therefore, he will delete obsolete backups and archives.
--     (According to our retention policy)
--   * This will be mentioned in the Alert.Log
--   * Alerts will fire at 85% (warning) and 97% (critical)
--   * The backup retention policy specifies which backups must be retained to meet your data 
--     recovery requirements. This policy can be based on a 
--     recovery window (the maximum number of days into the past for which you can recover) 
--     or redundancy (how many copies of each backed-up file to keep).


-- Via OEM:
-- Availability => Recovery Settings  (MTTR, ARCHIVELOG, Flash-Recovery-Area)
-- Availability => (Setup) => Recovery Settings  (MTTR, ARCHIVELOG, Flash-Recovery-Area)


-------------------------------------------
-- The necessity for Archivelog and backups 
-------------------------------------------
-- * Consider the following scenario : 
--   A. Sunday => DF Backups : DF1, DF2, DF3 all on SCN 50 
--   B. Monday => All DF on SCN 80, but DF3 (due to some media failure) was destroyed 
-- * Steps in recovering DF3 
--   1. use backup to recover DF3 to SCN 50
--   2. use archives to recover DF3 close to SCN 80
--   3. use redologs to recover DF3 to SCN 80


-- ARCHIVE VS NOARCHIVELOG summary
-------------------------------------
-- * When redo log can be overwritten
--   * ARCHIVE   - checkpoint + archive 
--   * NOARCHIVE - checkpoint 
-- * Backups 
--   * ARCHIVE   - hot  (db is online)/cold and partial/full, possible to recover 
--                 untill last commit
--   * NOARCHIVE - cold (db is offline) and full, no way to recover since last backup 

----------------------------------------
-- ARCHIVELOG mode
----------------------------------------
-- * Archives old redo log information, instead of deleting it.
-- * Enables Hot backup, Partial backup and Complete recovery.
-- * Via the OEM:
--    Availability => Recovery Settings 
--    (Instance Recovery - MTTR, Media Recovery - ARCHIVE Mode, Flash Recovery Settings)


-----------------------------------
-- Place the DB in Archivelog mode:
-----------------------------------

SELECT log_mode FROM v$database;

-- 1. Perform clean shutdown:
shutdown immediate 

-- 2. Startup in MOUNT mode:
startup mount
 
-- 3. Enter ArchiveLog mode:
ALTER DATABASE ARCHIVELOG;

-- 4. Open the database
ALTER DATABASE OPEN;

SELECT log_mode FROM v$database;

-- 5. Perform Whole DB Backup (The first base backup)

SELECT log_mode FROM v$database;