SQLMCM Training – Day 11

Today Jonathan presented on Replication and Paul about Database Snapshots, and his baby called “CHECKDB” – he has been married with CHECKDB for almost 5 years at Microsoft.

Module 8: Replication

  • Distributes data to different locations
  • Offloading reporting to another database
  • Transactional Replication
    • Used in server-to-server scenarios that require high throughput
    • Different Agents
      • Snapshot Agent
      • Log Reader Agent
        • Is also used by CDC & Change Data Capture
        • See log_reuse_wait_desc column in sys.database
        • Only committed transactions are copied to the Distribution Database
      • Distribution Agent
        • Runs on the Distributor for Push Subscriptions
        • Runs on the Subscriber for Pull Subscriptions
        • Copies transactions from the Distribution Databases to the Subscriber Databases
    • Peer-to-Peer Replication maintans copies of data across multiple server
      • If you enable it, you can’t go back!
      • P2P Replication must be set per publication
  • Merge Replication
    • Primarily for mobile and distributed server that have possible data conflicts
    • Starts with a snapshot
    • Incremential changes for data and schema are tracked with triggers
    • Conflict Detection
    • POS (Point Of Sale) appliations
    • Different Agents
      • Snapshot Agent
      • Merge Agent
        • Applies initial snapshots to subscribers
        • Merges incremential changes
        • Detects and resolve conflicts
          • Priority based – highest priority wins
          • SQLCLR/COM- based on complex business rules
    • Only the final changes to the data are applied
      • E.g. multiple Updates
      • Net changes are tracked at publisher and subscriber are merged periodically with conflict detection
    • Needs planning for TempDb & VersionStore
      • Depends on the number of articles
  • Snapshot Replication
    • You can a snapshot of the data delivered to the subscription
    • Can be used as a preparation for Transactional- and Merge Replication
      • For large databases a backup would be a better option
      • You can also use Bulk Copy the initial data
    • Snapshot takes locks when the snapshot is generated
      • Shared locks on the table are acquired
      • RCSI is not supported
  • Distribution database is needed, because there can be several subscribers
    • When you make changes in the Distribution Database, you are not supported by Microsoft
      • Like Indexing
    • Large Distribution Databases should be moved to its own server
  • SQL Server uses Table Scans inside the Distribution database, which can perform badly when the distribution database gets larger and large
    • You can set the retention period
  • Updatable Subscriptions
    • Allows subscribers to replicate changes back to the publisher
  • Database Mirroring & Replication
    • Publisher: Full Support
    • Subscriber. Limited Support
    • Distributor: No Support
      • Failover Clustering is the only High Availability option
  • Failover Clustering & Replication
    • Supported on each Replication role (Publisher, Subscriber, Distributor)
  • Replication Monitor
    • Provides health about the Replication topology
  • Tracer Token
    • Allows you to measure latency

Module 9: Database Snapshots

  • Consistent point of view of database
  • Can be created on a mirror to access database
  • Sparse file as a mapping- and data area
  • SQL Server is pushing 8kb once a time into the snapshot, not Extents
  • Once a page is pushed in the snapshot, it is never pushed again into the snapshot
  • Buffer Pool has an in-memory map which pages are in the snapshot and which are not
    • The first time when the snapshot is accessed the in-memory map is generated
  • When the database is grown, the new pages are never in the snapshot, because they never existed in the snapshot when the database was created
  • When the original page is pushed into the snapshot, the page is latched (the BUF_LATCH structure)
  • 3 different Snapshots of a database means 3 synchronous writes when a page gets updated
    • The original page must be pushed synchronously in each of the 3 snapshots
  • Snapshot on a Snapshot will not work
  • The snapshot stores its own copy of a page in the Buffer Pool, when the page is read
    • This can lead to memory pressure in the Buffer Pool
  • When there are open transactions during the creation of the snapshot
    • Crash Recovery is of the source database is run into the snaphot
    • E.g. When a transaction is active, a rollback of this transaction is occuring in the context of the snapshot
  • When you create a snapshot on a mirror, the consistent point of time of the snapshot is the time, when the last CHECKPOINT operation occured on the mirror
  • First step is a CHECKPOINT operation, you will not get in the snapshot the following:
    • All transactions that have comitted AFTER the CHECKPOINT operation, they are rollbacked
    • All transactions that were running at the CHECKPOINT operation, and were not committed at the time the CHECKPOINT operation occured
  • FILESTREAM file group can’t be included in a snapshot, but it does not prevent creating the snapshot
    • But you can’t REVERT the snapshot, if there is a FILESTREAM file group
  • Snapshot goes suspect, when there is no more space for the snapshot
    • Source database is un-effected, and snapshot can’t be used anymore
  • You can’t detach or restore the source code, when there is a snapshot defined for that database
  • http://blogs.msdn.com/b/psssql/archive/2009/03/04/sparse-file-errors-1450-or-665-due-to-file-fragmentation-fixes-and-workarounds.aspx

  • sys.databases/source_database_id IS NOT NULL return all Snapshots
  • When you begin a transaction and finally rollback the transaction, the changed pages from that transaction stays in the snapshot
  • REVERTing a snapshot sets the log file to 0,5MB and 2 VLFs
  • Database Snapshot Performance Considerations under I/O Intensive Workloads

Module 10: Corruption Detection and Recovery

  • I/O Errors
    • 823: Hard I/O error
      • OS can’t read data from disk
    • 824 :Soft I/O error
      • Page Checksum failure
      • Torn Page Detection
        • Bits are corrected as soon as the pages are read into the Buffer Pool
      • Wrong PageID
    • 825: Read-Retry Error
    • Logged in msdb.suspect_pages
  • Page Protection options doesn’t protect you
    • They just detect corruption
  • Page Checksum is stored in the Page Header
  • When Page Checksum is enabled, nothing is done
    • A page is only checksumed, when the page gets dirty
    • Not available for TempDb since SQL Server 2008
      • You have to enable Checksum explictely on TempDb when upgrading from SQL Server 2005 to SQL Server 2008
  • Automatic Page Repair is available in both Standard and Enterprise Editions
  • CHECKDB doesn’t take any locks
  • Last Known Good (last time when CHECKDB run without any problems) is reported in the SQL Server Error Log, when you start SQL Server
  • EXTENDEND_LOGICAL_ACCESS
    • Indexed Views
    • XML Indexes
    • Spatial Indexes
  • Statistics Blobs are not checked
  • Repair can break foreign-key constraints
  • VLDBs
    • DBCC CHECKFILEGROUP
      • E.g. One partition per each night
  • You can run CHECKTABLE on the system tables
  • REPAIR_REBUILD option need single user mode
  • TABLERESULTS
    • Returns the output as a table
    • Undocumented, because the output can change from release to release
  • Online Index Rebuild reads the old index during the rebuild
    • Offline Index Rebuild also reads almost every time the old index during the rebuild
    • You have to drop and recreate the NCI
    • It could be problematic if the NCI enforces a constraint, in the mean time when the NCI is dropped, users can insert data that is not enforced by the NCI, and afterwards you can’t recreate the NCI
  • DBCC CHECKDB can be run on a suspect database
    • Suspect means that recovery started, but couldn’t be completed, so it is not transactional consistent
    • But you must put it in EMERGENCY mode
  • Not all pages can be single page restored (because not all operations on them are fully logged)
    • Boot page
    • Fileheader page
    • Allocation bitmap (not including IAM pages)
    • Certain pages in hidden, critical system catalogs
    • Logged in msdb.suspect_pages
    • Up to 100 pages can be restored during a Single Page Restore
    • Online Single Page Restore is an Enterprise feature
    • If you have subsequent log backups, they must be also restored
      • You also need to do a tail-log backup and restore it also
    • And the log system knows that only log records for the specific page must be replayed/rollbacked
  • Allocation System belongs to Page-ID 99
  • Repair is going to delete data!
  • Repair is always offline
  • When you repair a Replication, you must re-initialize the Subscriptions
    • Repair is physically, Replication is logically
  • Msg 8921 when TempDb run out of space
  • Do not try to repair system tables!!!
  • Emergency Mode for rebuilding the transaction log
  • XVI32 Hex Editor 😉
  • Damaged PFS page
    • Allocation Order Scan uses the PFS page
      • E.g. SELECT * FROM TableA
    • You can retrieve the data, even when the PFS page is damaged, by using an Clustered Index Scan by ordering by the Clustered Key, or using an index hint that forces an Clustered Index Scan
  • A suspect or recovery pending database can’t be detached on SQL Server 2008, because it can’t be re-attached any more…
    • When you enforce it, it can be detached
    • To attach it again
      • Create a new database with the SAME log
      • Set the database offline
      • Delete the MDF/LDF file
      • Copy the MDF/LDF file from the detached database
      • Set the database online
      • Database is again in the SUSPECT state
  • ATTACH_REBUILD_LOG

Thanks for reading and stay tuned for the last week J

-Klaus