Lesson #1 - Tables – Indexes Best Practices
- Clustered Columnstore index (Default Primary) - Highest level of data compression. Best overall query performance
- Clustered index (Primary) - Performant for looking up a single to few rows
- Heap (Primary) - Faster loading and landing temporary data. Best for small lookup tables
- Nonclustered indexes (Secondary) - Enable ordering of multiple columns in a table. Allows multiple nonclustered on a single table. Can be created on any of the above primary indexes. More performant lookup queries
- The queries have equality, inequality, or range predicates
- The predicate columns and the ordered CCI columns are the same.
- The predicate columns are used in the same order as the column ordinal of ordered CCI columns.
- Caching of results, Enable caching at DB level then query level - Resultcachehit flag returns the value whether it was reused
- Hash Distribution: Large fact tables exceeding several GBs with frequent inserts should use a hash distribution.
- Round Robin Distribution: Potentially useful tables created from raw input. Temporary staging tables used in data preparation.
- Replicated Tables: Lookup tables that range in size from 100’s MBs to 1.5 GBs should be replicated. Works best when table size is less than 2 GB compressed.
Lesson #3 - Result-set caching
Cache the results of a query from SQL pool storage. This enables interactive response times for repetitive queries against tables with infrequent data changes. The result-set cache persists even if SQL pool is paused and resumed later.
Cache Checks
You can tell if a query was executed with a result cache hit or miss by querying sys.pdw_request_steps for commands where value is like ‘%DWResultCacheDb%’
- Automatically discovers columns containing potentially sensitive data
- It provides a simple way to review and apply the classification recommendations through the Azure portal.
- The sensitive data labels are persisted in the database (metadata attributes) and it audits and detects access to the sensitive data.
- We offer built-in set of labels and information types, however customers can chose to define custom labels across Azure tenant using Azure Security Center
Lesson #5 - Dynamic Data Masking
- Prevent abuse of sensitive data by hiding it from users
- Easy configuration in new Azure Portal
- Policy-driven at table and column level, for a defined set of users
- Data masking applied in real-time to query results based on policy
- Multiple masking functions available, such as full or partial, for various sensitive data categories (credit card numbers, SSN, etc.)
No comments:
Post a Comment