Prep Developing SQL Databases (Exam 70-762)

Home

Get App

Create

What is exam 70-762 and what topics does it cover?
Broken into 4 Chapters, it covers
- (a) Design and implement database objects including tables, schemes, indexes, views, and columnstores
- (b) Design objects relating to data integrity and constraints, stored procedures, triggers, and user-defined functions
- (c) Manage database concurrency through transactions, isolation levels, locking behavior, and memory-optimized tables and native stored procedures
- (d) Using tools to optimize object and infrastructure including statistics and maintenance tasks, analyze and troubleshoot query plans, manage performance and workload configurations, and monitor and trace performance metrics.
What is normalization?

Database normalization is the process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. There are 6 normal forms and each one must be fulfilled prior to moving into the next normal form.
What are the first 3 normal forms
- (1) Every column in a table represents a single value (ie, no arrays).
- (2) Functional dependence: All columns are related to the table’s primary key (ie, Leads wouldn't have a column called OrderItem)
- (3) Non-Transitive dependence: Columns can't have transitive relationships between other columns, meaning each column must be solely independent and not relative to the other. For example, if we had a DOB field with a Age field, this would be transitive because Age is dependent on dob such that one is dependent on the other.
- (4) AKA Boyce-Codd Normal Form: Says that each candidate key must be unique, even if the key is a composite key and this key represents the whole row as a set. This is a subset of 3rd normal form.
Write a CREATE TABLE command for a table called Leads which has 5 columns: Id which is a non-null primary key which is also unique and starts at 1 and is database generated, a non-null unicode FullName column, a DOB field containing a default value which can only be a date, a Gender column containing a single character which can only be 'M' or 'F' and allows Null values, and LeadType field which is a foreign key to a table called LeadTypes.
- CREATE TABLE Leads6
- (
- Id INT IDENTITY(1,1) CONSTRAINT PK_Leads_7 PRIMARY KEY,
- FullName NVARCHAR(50) NOT NULL,
- DOB DATE NOT NULL DEFAULT (GETDATE()),
- Gender CHAR(1) NULL CONSTRAINT [CHECK_GENDER7] CHECK (([Gender]='F' OR [Gender]='M')),
- LeadType INT CONSTRAINT [FK_Leads_LeadType_Id7] FOREIGN KEY REFERENCES LeadType(Id)
- )
Write code to
(a) drop a foreign key from a table.
(b) add a new column with a new foreign key
- ALTER TABLE dbo.Leads
- DROP FK_LEADS_ORDERID
- ALTER TABLE dbo.Leads
- LeadType INT CONSTRAINT [FK_Leads_LeadType_Id7] FOREIGN KEY REFERENCES LeadType(Id)
What are the numerical SQL types? Give approximate values for each.
- (a) bit = (2^1) = 1,0, Null
- (b) tinyint (2^8) = 0-255
- (c) smallint (2^16 / 2) = -32,768 to 32,767
- (d) int (2^32 / 2) = -2.14 billion to 2.14 billion
- (e) bigint (2^64 / 2) = 9.2 quadrillion to 9.2 quadrillion
- (f) decimal(p, s) = (10^38), fixed precision & scale
- (g) smallmoney = 214,748.3648 = 4 bytes, 4 precision
- (h) money = 9.2 trillion, = 8 bytes, 4 precision
- (i) float(n) = -1.79E
- (j) real = -3.40E
What are the datetime values in SQL? Which is used for time zones? How would we use date() for a birthdate?
- (a) date: 1/1/0001 - 12/31/9999 [3 bytes]
- (b) time(n) [3-5 bytes]
- (c) datetime: 1/1/1753- 12/31/9999 [8 bytes]
- (f) datetime2: 1/1/0001- 12/31/9999 [6-8 bytes]
- (e) smalldatetime: 1/1/1900-06/06/2079 [4 bytes]
- (d) datetimeoffset: same as datetime2, but doesn't handle daylight savings time
What are the string data types in SQL? Give approximate character or size length.
char(n): Fixed width character string (8,000 characters)
- varchar(n): Variable width character string (8,000 characters)
- varchar(max): Variable width character string (1,073,741,824 characters)
- nchar: Fixed width Unicode string (4,000 characters)
- nvarchar: Variable width Unicode string (4,000 characters)
- nvarchar(max): Variable width Unicode string (536,870,912 characters)
- text: Variable width character string (2GB of text data)
- ntext: Variable width Unicode string (2GB of text data)
- binary(n): Fixed width binary string (8,000 bytes)
- varbinary: Variable width binary string (8,000 bytes)
- varbinary(max): Variable width binary string (2GB)
image: Variable width binary string (2GB)
Would we use binary or varbinary or varbinary(max) to store an image? Why?

We would use varbinary because regular binary is fixed-length and using with a max takes up too much room.
For a name and text field. Which would we use, char, varchar, varchar(max) or nvarchar(n)?

We would use nvarchar(n) for both as char is fixed-length, max is wasteful, and varchar types only accept ascii characters whereas nvarchar accepts unicode which allows us to use special characters for different cultures like spanish, etc.
What are 4 other data types not typically used in SQL but can be under special circumstances?

XML (documents), UniqueIdentifier (creates unique random value), spacial (mapping), and sql_variant (custom data type)
What is a computed column? Add one to a table.
Special column which is computed rather than hold a special data type. For example the column,
- ALTER TABLE dbo.Leads
- ADD FullName AS CONCAT(FirstName, ' ' + LastName)
What is dynamic data masking? What are the 4 types?

Allows you to make/hide data from the user, such as an email address like M*****@***.com.

There are 4 types of masking: default (uses default value), email, random (allows a random value within a range), and partial (replaces center with masked values).
Alter dbo.Leads column called Email and add dynamic masking for default, partial, and emails. Then create a user called 'reader' with no logon and read the results. Revert back to original user when done.
- (a)
- ALTER TABLE Leads6
- ALTER COLUMN Email
- ADD MASKED WITH (FUNCTION = 'partial(3, "*********", 2)';
- (b)
- CREATE USER reader WITHOUT LOGIN;
- GRANT SELECT ON Leads TO reader;
- EXECUTE AS USER = 'reader';
- SELECT * FROM Leads;
- REVERT;
(a) Modify email address column to use data masking.

(b) Modify phone number column with partial data masking.

(c) Modify carCount column with a random value between 4-9
- (a)
- ALTER TABLE Example
- ALTER COLUMN EmailAddress
- ADD MASKED WITH (FUNCTION = 'email()');
- (b)
- ALTER TABLE Example
- ALTER COLUMN PhoneNumber
- ADD MASKED WITH (FUNCTION = 'partial(3,"*****",2)');
- (c)
- ALTER TABLE Example
- ALTER COLUMN CarCount
- ADD MASKED WITH (FUNCTION = 'random(4,9)');
A table without a clustered index is known as what? Explain. What is a hash table?

A heap, that is, it is non-sequential. It may contain a non-clustered index (B-tree like structure) but those are not sequential, rather, the index contains a reference to the actual data elsewhere.

A hash table is a data structure that implements an associative unordered array, a structure that can map to keys/value pairs. A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found.
Can a primary key contain a Null value? How about a Unique constraint? Foreign key? When we create a primary key, how is that interpreted?

Primary keys can not be null, but unique column and foreign key values can contain null values if not explicitly constrained.

When a primary key is created, the database automatically creates a corresponding unique clustered index... but a primary key and a unique clustered index are different things but are usually similar.
Add an index on the DOB column of Leads table, then drop it.
- CREATE INDEX IX_DOB ON dbo.Leads(dob);
- DROP INDEX dbo.Leads.IDX_Gender;
What is the naming convention for primary key, foreign key, index?

PK_[table]_[key]

FK_[table]_[foreign_key_table]_[foreign_key]

IX_[table]_[key] (index, non-unique)

UX_[table]_[key] (index, unique)
How can we turn on/off printable results to see the time it takes to run a query and to get some basic information like scan/seek, logical reads, physical reads, read-aheads, etc in the messages area?
- SET STATISTICS TIME ON;
- SET STATISTICS IO ON;
SELECT * FROM dbo.Leads;
- SET STATISTICS TIME OFF;
- SET STATISTICS IO OFF;
How do we see a query plan estimates? How should query plans be read?

We see them by highlight them and hitting CTRL-L or Query | Display Estimated Query Plan menu item. We should read them from right to left, the most right side contains the record set which was pulled from the table(s). If we are using a seek then it is quick as we are finding one or a few records using a clustered index as opposed to seeing a scan which means we are querying entire table.
Alter a table to add a foreign key from the Leads table (AddressId) column to the Address (Id) column.
- ALTER TABLE Leads
- ADD CONSTRAINT [FK_LEADS_ADDRESS_ID] FOREIGN KEY (AddressId) REFERENCES dbo.Address(Id);
Name the type of scan/seek which is performed in the following queries.
(a) select * from dbo.Leads; (index on Id)
(b) select * from dbo.Leads; (no index)
(c) select * from dbo.Leads where city = 'Lancaster'; (no index)
(d) select * from dbo.Leads where city = 'Lancaster'; (index)
(e) select * from dbo.Leads where Id = 3; (index)
(f) select * from dbo.Leads L Inner Join Address A on L.AddressId = A.Id;
- (a) select * from dbo.Leads; (index on Id) -- clustered index scan
- (b) select * from dbo.LeadsCopy3; (no index) -- table scan
- (c) select * from dbo.LeadsCopy3 where FirstName = 'Lancaster'; (no index) -- table scan
- (d) select * from dbo.Leads where FirstName = 'Lancaster'; (index) -- clustered index scan/nonclustered index seek
- (e) select * from dbo.Leads where Id = 3; (index) -- clustered index seek
- (f) select * from dbo.Leads L Inner Join Address A on L.AddressId = A.Id; -- clustered in scan(L), clustered index seek(A)
What's the difference between a logical read and a physical read?

A logical read is when the query engine needs to read data. First, it looks in memory. If the page is already in SQL Server's memory, then it uses that. If it can't find it in memory, then that triggers a physical read and the data page is read from disk.
What are Included columns and why would we use them?

If we are scanning a whole table but only use a few columns in those rows (ie, count, total) then we can improve on performance by minimizing the number of columns returned so we don't fetch every column while building our result. This is done using a covering index and is especially helpful if we see a Key Lookup being performed in an execution plan.
Create a covered index on the Sales.Orders table which can be used to include 2 columns that have no indexes on them and whose query plan shows a key lookup which is taking up too much time to run the query.
- CREATE NONCLUSTERED INDEX IX_Contact_Include
- ON Sales.Orders(ContactPersonId)
- INCLUDE (OrderDate, ExpectedDeliveryDate)
You are creating a program which uploads attachments and pictures and decide to use a GUID (global unique identifier) for the identification field. How would you create this column? Why use this over an integer field for the primary key?

A GUID is a 16-byte binary value with a 36-character representation. It can be placed into a field of type UNIQUEIDENTIFIER with a DEFAULT NEWID() statement.

A GUID is unique across every table, database, and server allowing ease in merging or distribution of databases or servers, and it has no essential limit (unlike int) ... on the downside they are very large (36 characters), cumbersome to debug, and are non-sequential.
We have the query below which runs too slowly on two tables with respective primary keys. What likely index or indexes can be created to speed up this query.

SELECT PurchaseOrderId, ExpectedDeliveryDate
FROM Examples.PurchaseOrders
WHERE Exists(SELECT * FROM Examples.PurchaseOrderItems WHERE PurchaseOrderItems.PurchaseOrderId = PurchaseOrders.PurchaseOrderId) AND
PurchaseOrders.OrderDate BETWEEN '2016-03-01' AND '2016-06-30'

(a) CREATE INDEX PurchaseOrderId ON Examples.PurchaseOrderItems(PurchaseOrderId);

(b) CREATE INDEX OrderDate_Incl_ExpectedDeliveryDate ON Examples.PurchaseOrders (OrderDate) INCLUDE (ExpectedDeliveryDate);

(c) CREATE INDEX OrderDate ON Examples.PurchaseOrders (OrderDate);

Items (a) and (b) are definitely needed, however, (c) is questionable as it doesn't appear to be used but is likely a good index candidate.
What is a view? What options are there in creating a view? Why do we use views? Show basic form of view.
View is a single select statement compiled into a reusable object that has no stored table data.

There are 3 options for creating views: schemabinding (prevents base tables from being modified if change affects the view), view_metadata (prevents 3rd party controls from viewing certain table metadata), Encryption (encrypts from certain users from viewing or modifying file without proper permissions)

We use views for hiding data, reformatting data, reporting, security, and providing table like reference of complex queries.

A new use for views is the ability to insert, delete, or update data in the underlying table using the view itself.
- CREATE VIEW SchemaName.ViewName
- WITH SCHEMABINDING, ENCRYPTION, VIEW_METADATA
- AS
- SELECT * FROM Customers;
- [WITH CHECK OPTION]
What is a partitioned view?

Feature which still exists for both backward compatibility and to enable a view object to work across multiple independent federated SQL Servers. It essentially allows a single view to be created wherein it refers to two separate servers in two systems acting as one.

For example, California and Kentucky headquarter offices can display cars sold, combining their two table values into a single view, using the Server.Database name and a UNION ALL statement.
(a) Drop a view called Sales.vOrders,
(b) Create a view called vOrders which uses schemabinding and grouped by OrderDate and ProductId
(c) Create an index for the view, thereby making the view an Indexed View or a Materialized View.
- (a)--Create view with schemabinding.
- IF OBJECT_ID ('Sales.vOrders', 'view') IS NOT NULL
- DROP VIEW Sales.vOrders ;
- GO
- (b) --create view
- CREATE VIEW Sales.vOrders
- WITH SCHEMABINDING
- AS
- SELECT SUM(UnitPrice*OrderQty*(1.00-UnitPriceDiscount)) AS Revenue,
- OrderDate, ProductID, COUNT_BIG(*) AS COUNT
- FROM Sales.SalesOrderDetail AS od, Sales.SalesOrderHeader AS o
- WHERE od.SalesOrderID = o.SalesOrderID
- GROUP BY OrderDate, ProductID;
- GO
- (c) --Create an index on the view.
- CREATE UNIQUE CLUSTERED INDEX IDX_V1
- ON Sales.vOrders (OrderDate, ProductID);
- GO
What are the benefits of a materialized view? What are some of the constraints for building these?

Benefits are faster views and views which are updated whenever the underlying data is updated.

Limitations include not using the * operator (select all), union, except, intersect, subqueries, outer joins, distinct, top, sum(), aggregates against NULLs, count(), schemabinding, and more.
What is columnstore indexes and why would we use them?

Designed specifically for analytical purposes (reports), information is stored vertically in columns, separated from their respective records. In this way, reports on very large data sets can be done in totals such as min(), max(), count(), sum(), and more without concern of other data in other columns.

Instead of B-tree structure of normal tables, columnstores put each column type into row groups of ~1 million records, each column has its respective column segments and any new records are placed into a deltastore structure until it exceeds the ~1 million records and forms a new row group. All data is then compressed from the columns into the columnstore for fast processing, usually seeing 10x or more performance improvements. This is excellent for reports or 'big data'.

To create the columnstore, we can simply choose our table and choose Indexes | New Indexes | New Clustered Columnstore Index. We can have a clustered or non-clustered index.

Clustered columnstore indexes change your table's storage and compress your data considerably, reducing the amount of IO needed to perform queries on very large data sets. Nonclustered columnstore indexes can be added to a rowstore table to allow real-time analytics.
What two other commands can be given to show the query plan information such as type of scan/seek, filtering, logical reads physical reads, estimated rows, estimated CPU usage, ect.
- SET SHOWPLAN_ALL { ON | OFF };
- SET STATISTICS PROFILE { ON | OFF };
What are the 5 constraint types?
- (a) Default (used when none is provided by user)
- (b) unique (unique but may be null)
- (c) check (applies simple predicate to insert or update)
- (d) foreign key (enforce relationship)
- (e) primary key (unique, non-null index and record identifier)
Show how to use UNIQUE in the alteration of a column to make it unique.

Add constraint where DOB forces user to be 18 or older.

Add constraint where GENDER is either 'Male' or 'Female'.
- ALTER TABLE dbo.Leads
- ADD CONSTRAINT [AK_Address] UNIQUE (Address);
- ALTER TABLE dbo.Leads
- ADD CONSTRAINT [CHECK_DOB] CHECK (DATEDIFF(YEAR, DOB, GETDATE()) > 18);
- ALTER TABLE dbo.Leads
- ADD CONSTRAINT [CHECK_GENDER] CHECK ([Gender] IN ('M','F'));
How do cascading operations work? Give several examples.
Write a statement for a reference key which must exist in a data type table prior to inserting a new record into a table, we want to check for this same type of action and force some behavior.

By default, we perform (a) no action, but we can also perform (b) cascade ON DELETE or UPDATE. The result, if no item is present will either be NULL or DEFAULT.
- CONSTRAINT FK_InvoiceLineItem
- REFERENCES Invoice(InvoiceId)
- ON DELETE CASCADE
- ON UPDATE NO ACTION
What is a CTE? Where can these be used? Give an example.
Called common tabl expressions, they are temporarily named queries which can be used almost anywhere but are especially helpful in hierarchy queries (HR).
- WITH Sales_CTE (SalesPersonID, SalesOrderID, SalesYear)
- AS
- (
- SELECT SalesPersonID, SalesOrderID, YEAR(OrderDate) AS SalesYear
- FROM Sales.SalesOrderHeader
- WHERE SalesPersonID IS NOT NULL
- )
- SELECT SalesPersonID, COUNT(SalesOrderID) AS TotalSales, SalesYear
- FROM Sales_CTE
- GROUP BY SalesYear, SalesPersonID
- ORDER BY SalesPersonID, SalesYear;
What is WITH CHECK and WITH NOCHECK used for?
When creating or altering your table when it already has data within the table, you can add WITH CHECK or WITH NOCHECK to test the existing data to make sure it complies with your new predicate (or alternatively, not check it). Like so,
- ALTER TABLE dbo.Leads WITH NOCHECK
- ADD CONSTRAINT CHECK_DOB CHECK (DATEDIFF(YEAR, DOB, GETDATE()) > 18);
Below are 3 SQL Commands.

CREATE TABLE Example.ScenarioTestType
(
ScenarioTestType varchar(1) not null constraint pcScenarioTestType primary key
);

CREATE TABLE Example.ScenarioTest
(
ScenarioTestId int NOT NULL Primary Key,
ScenarioTestType varchar(10) NULL Constraint CHK_ CHECK ScenarioTestType IN ('Type1', 'Type2');
)

ALTER TABLE Example.ScenarioTest
ADD CONSTRAINT FK_
FOREIGN KEY (ScenarioTestType) REFERENCES Example.ScenarioTestType;

Using above scenario, how many rows are inserted into Examples.ScenarioTest?

(a) INSERT INTO ScenarioTest(ScenarioTestId, ScenarioTestType) VALUES (1, 'Type1');
(b) INSERT INTO ScenarioTestType(ScenarioTestType) VALUES ('Type1');
(c) INSERT INTO ScenarioTest(ScenarioTestId, ScenarioTestType) VALUES (1, 'Type1');
(d) INSERT INTO ScenarioTest(ScenarioTestId, ScenarioTestType) VALUES (1, 'Type2');
(e) INSERT INTO ScenarioTest(ScenarioTestId, ScenarioTestType) VALUES (2, 'Type1');
(f) INSERT INTO ScenarioTests(ScenarioTestId, ScenarioTestType) VALUES (3, 'Type1');
- (a) fails, foreign key constraint
- (b) passes, but not to correct table
- (c) success, we now have 1 record
- (d) fails, violates primary key constraint
- (e) success, we now have 2 records
- (f) fails, wrong table name
What are the 3 types of stored procedures? Describe each.
- (a) Interpreted SQL: basic type we always use, they are compiled and then ran one line at a time.
- (b) CLR procedures: developed in .NET language and have a T-SQL mechanism to call a procedure.
- (c) Natively compiled SQL: These are compiled into C language modules and can access memory optimized tables and natively compiled objects only.
What are the 4 types of options you can specify with stored procedures? Explain each.
- WITH ENCRYPTION: encrypts procedure so it can't be modified using designer or even see parameter list or help functions.
- WITH RECOMPILE: plan is not to be cached, rather recompiled every time it is called
- WITH EXECUTE AS: used to execute with security context
- WITH REPLICATION: Often used with merge or other ETL replication commands (extract, transform, load)
Write a simple stored procedure that update the DOB of a leads record.
- CREATE PROCEDURE dbo.spUpdateLead @id int, @dob date
- AS
- BEGIN
- SET NOCOUNT ON -- don't return number of records updated
- UPDATE dbo.Leads SET DOB = @dob WHERE Id = @id;
- END
Write a simple stored procedure that update the DOB of a leads record and then returns two values: Gender, City.

Then write a call to execute the procedure and output the results.
- CREATE PROCEDURE dbo.spUpdateLead6 @id int, @dob date, @gender char(1) output, @city varchar(50) output
- AS
- BEGIN
- UPDATE dbo.Leads SET DOB = @dob WHERE Id = @id;
- SELECT @gender = gender, @city = 'Lancaster' FROM dbo.Leads WHERE Id = @id;
- END
- declare @gender char(1), @city varchar(100);
- exec dbo.spUpdateLead6 2, '1971-01-10', @gender output, @city output
- select @gender, @city;
Write the syntax for a trigger. What is the NOT FOR REPLICATION option used for?
- CREATE TRIGGER [schema_name.]trigger_name
- ON table_name
- AFTER {[INSERT],[UPDATE],[DELETE]}
- [NOT FOR REPLICATION]
- AS
- sql_statements
The NOT FOR REPLICATION option instructs the server not to fire the trigger when data modification is made as part of a replication process (ie, ETL).
Write a trigger which inserts into an audit table products after they've been inserted or deleted from a table.
- CREATE TRIGGER dbo.tgrLeadsAudit
- ON Leads
- AFTER INSERT, DELETE
- AS
- BEGIN
- SET NOCOUNT ON;
- INSERT INTO LeadsAudit (id, firstname, dob, gender, ssn, age, actiontype)
- SELECT i.id, i.FirstName, i.DOB, i.Gender, i.SSN, i.Age, 'INSERT' FROM Inserted i
- UNION ALL
- SELECT d.id, d.FirstName, d.DOB, d.Gender, d.SSN, d.Age, 'DELETE' FROM Deleted d;
- END
The procedure below is poorly written. Why? Rewrite it to improve performance.

CREATE PROCEDURE dbo.spSearchDate1 @dob DATE
AS
BEGIN
SELECT * FROM dbo.Leads WHERE CAST(dob AS DATE) = @dob;
END
It is poorly written because DOB has the CAST operation performed on it such that, even if we had an index on the column for faster processing, it couldn't be used because the field is first cast and then compared, meaning we would have to scan entire table rather than perform a simple seek in the table.
- CREATE PROCEDURE dbo.spSearchDate2 @dob DATE
- AS
- BEGIN
- SELECT * FROM dbo.Leads WHERE DOB = @dob;
- END
Show how to throw an error. Show how to raise an error. What's the difference? Is there another way to catch find errors?
- (a) THROW 5000, 'This is an error', 1;
- (b) RAISERROR('This is an error', 16, 1);
- (c) While you can change the state on both (1), you can only change the error severity (16) on raiserror but most importantly, throw stops a batch whereas raiserror continues so all lines after keep getting processed.
- (d) You can also use @@ERROR <> 0 directly after a statement to determine if the statement created an error or not.
Using a TRY/CATCH block is better than using a @@ERROR <> 0 statement. Create a TRY/CATCH below and display the error to the user if one exists.
- BEGIN TRY
- UPDATE DOB = @dob FROM Leads WHERE ID = @id;
- END TRY
- BEGIN CATCH
- SELECT ERROR_PROCEDURE(), ERROR_MESSAGE(), ERROR_NUMBER(), ERROR_LINE(), ERROR_SEVERITY(), ERROR_STATE();
- END CATCH
Explain how nested transactions work and how we test for them. What statement is used to see if a transaction is in effect? Is there a statement that stops a transaction from continuing if an error is raised?
- (a) We typically write a transaction with BEGIN TRANSACTION, COMMIT TRANSACTION OR ROLLBACK TRANSACTION. However, with nested transactions, a new transaction is begun prior to the earlier one being committed, resulting in a transaction count increase, and we can test this by using the @@TRANCOUNT variable.
- (b) XACT_STATE() <> 0 will test if a transaction has started but not been completed [0 means none, 1 means active]
- (c) Yes, to continue with a batch after a transaction has raised an error, simply use the XACT_ABORT ON statement at the beginning of the procedure.
What is the query below used for?

BEGIN TRY
BEGIN TRAN
BEGIN TRAN
PRINT @@TRANCOUNT;
THROW 50000, 'ERROR', 1
RAISERROR ('ERROR STUFF', 16, 1);
COMMIT TRAN
ROLLBACK TRAN
END TRY
BEGIN CATCH
SELECT XACT_STATE();
SELECT ERROR_PROCEDURE(), ERROR_MESSAGE(), ERROR_NUMBER(), ERROR_LINE(), ERROR_SEVERITY(), ERROR_STATE();
END CATCH

It shows (a) nested transactions (b) try/catch error handling (c) transaction counts (d) how to throw an error or raise an error (e) how to commit or rollback a transaction, and (f) how to test if an active transaction exists.
Your receive the error message below. Why? How do we prevent this type of error?

Msg 266, Level 16, State 2, Procedure spChangeTransactionLevel, Line 0 [Batch Start Line 5]
Transaction count after EXECUTE indicates a mismatching number of BEGIN and COMMIT statements. Previous count = 1, current count = 0.
Msg 3903, Level 16, State 1, Line 8
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION.
(a) We get this error because a query and stored procedure can't communicate with one another for use with transactions (can't use the commit transaction or rollback transaction from procedure when began tran started in query) but we can correctly increment or decrement the @@trancount variable in each.
- (b) CREATE PROCEDURE dbo.spChangeTransactionLevel
- AS
- BEGIN TRANSACTION
- ROLLBACK TRANSACTION
- BEGIN TRANSACTION
- EXEC dbo.spChangeTransactionLevel;
- ROLLBACK TRANSACTION
Is there such a thing as Nested Transactions? Explain.
While the term is pretty common, the truth is, no, we cannot really nest transactions as this goes against ACID compliance in ATOMIC transaction definition.
- - A rollback will roll back ALL open transactions and decrement the @@TRANCOUNT to 0.
- - A commit will pseudo-commit the changes up to the last "begin tran" statement and decrement the @@TRANCOUNT by 1. That is, the "nested" commit isn't actually committed, in fact, if we test this and rollback after the first commit, we can see that no changes ever occurred as ALL transactions were rolled back, including the one we had already committed.
Create a stored procedure with try/catch and a rollback to save transaction point.
- CREATE PROCEDURE dbo.AssignUserToTicket
- (
- @updateAuthor varchar(100),
- @assignedUser varchar(100),
- @ticketID bigint
- )
- AS
- BEGIN
- BEGIN TRANSACTION;
- SAVE TRANSACTION MySavePoint;
- BEGIN TRY
- UPDATE ...
- INSERT INTO ...
- COMMIT TRANSACTION
- END TRY
- BEGIN CATCH
- IF @@TRANCOUNT > 0
- BEGIN
- ROLLBACK TRANSACTION MySavePoint;
- END
- END CATCH
- END;
Write script which will delete a record that throws an error and is caught, but uses XACT_STATE method to determine how to handle the error.
- USE AdventureWorks2012;
- GO
SET XACT_ABORT ON;
- BEGIN TRY
- BEGIN TRANSACTION;
- -- A FOREIGN KEY constraint exists on this table.
- DELETE FROM Production.Product WHERE ProductID = 980;
- COMMIT TRANSACTION;
- END TRY
- BEGIN CATCH
- IF (XACT_STATE()) = -1 -- Uncommittable transaction, if we attempt to commit we will get an error
- ROLLBACK TRANSACTION;
- IF (XACT_STATE()) = 1
- COMMIT TRANSACTION; -- Committable transaction, is valid, simply commit
- IF (XACT_STATE()) = 0 -- No open transactions, throw an error or simply exit
- THROW;
- END CATCH;
- GO
What is an INSTEAD OF trigger? How would we use it and why?

This is a specific type of trigger that allows you to skip an INSERT, DELETE, or UPDATE statement to a table or a view and execute other statements instead.

This can be useful for prepping the data before insertion or perhaps logging a delete before the delete actually occurs.
Define DML, DDL, and Logon Triggers.
- (a) Data Manipulation Language (DML) are triggers which react to INSERT, DELETE, or UPDATE data changes.
- (b) Data Definition Language (DDL) triggers are used to react to changes in the definition (schema) of objects, such as CREATE TABLE, ALTER PROCEDURE, etc.
- (c) Logon Trigger react to a person or entity logging into the server. These triggers just stop a login action, so can't be created to force someone off once they are already in the system.
Write 3 code statements
(a) Create a login called MNobes with password = 'Password1'
(b) Create a database user using the login.
(c) Give this user permission to alter any database.
(d) Give user the ability to execute a stored procedure on dbo
(e) Give user ability to execute ALL stored procedures
- (a) CREATE LOGIN MNobes WITH PASSWORD = 'Password1';
- (b) CREATE USER MNobes FOR LOGIN MNobes;
- (c) GRANT ALTER ANY DATABASE TO MNobes;
- (d) GRANT EXECUTE ON SCHEMA::dbo TO Mnobes;
- (e) GRANT ROLE sp_admin; GRANT EXEC TO MNobes;
Create a trigger which will log every logon by a specific user called Login_NotAllowed.
- CREATE TRIGGER tgrLogin_NotAllowed
- ON ALL SERVER
- WITH EXECUTE AS 'Login_NotAllowed'
- FOR LOGON
- AS
- IF ORIGINAL_LOGIN() = 'Login_NotAllowed'
- THROW 50000, 'Unauthorized Access', 1
- ELSE
- INSERT INTO Examples.LoginLog(LoginName, LoginTime, ApplicationName)
- VALUES (ORIGINAL_LOGIN(), SYSDATETIME(), APP_NAME());
Create a scalar function that total returns the number of total orders created by a specific user during a particular year. Then show how we would call it.
- CREATE FUNCTION dbo.fnTotalOrdersByCustomer(@customerid CHAR(6), @year INT)
- RETURNS INT
- AS
- BEGIN
- DECLARE @total INT;
- SELECT @total = SUM(unitprice * quantity)
- FROM [order details] a INNER JOIN orders b
- ON a.orderid = b.orderid
- WHERE DATEPART(YEAR, orderdate) = @year AND
- customerid = @customerID;
- RETURN @total
- END
We call it like so select dbo.fnTotalOrdersByCustomer(100, 1997);
What are some of the constraints for using user-defined functions?
- (a) No error handling or use of @@Error or Raiserror
- (b) No transactions
- (c) Can't call stored procedures
- (d) Returns only one value or result set [although we can return a table as one value]
- (e) Can't use XML
Modify a scalar function which returns a purchase order Id value of 13-digits, of the form CPO-0000000123
- ALTER FUNCTION dbo.fn_GetPurchaseOrderId (@id INT)
- RETURNS NVARCHAR(13)
- WITH RETURNS NULL ON NULL INPUT, SCHEMABINDING
- AS
- BEGIN
- RETURN (N'CPO-' + RIGHT('0000000000' + CAST(@id AS VARCHAR), 9));
- END
What does it mean to say a function is deterministic or non-deterministic? Provide an example function for each. Write a statement to test if a function is deterministic or not.
- (a) Deterministic means we will always get back the exact same result every time we run a function vs not getting the same result every time.
- (b) A function which returns all capital letters is deterministic. A function which returns the day of the week it is or the system time, is non-deterministic.
- (c) SELECT OBJECTPROPERTY(OBJECT_ID('dbo.fnTest'), 'IsDeterministic') As IsDeterministic
What is ACID? Define.
ACID (atomic, consistent, isolated, durable) is a set of properties of database transactions intended to guarantee data validity despite errors, power or hardware failures, and other mishaps.

SQL Server guarantees ACID compliance already, however, we can request additional isolation levels, if necessary.
- A: Atomic refers to a single unit of work grouped together, all or nothing.
- C: Consistency refers to the state of the system is always in a valid state, even when a failure occurs, and we can always expect to get the same result (assuming its deterministic in nature)
- I: Isolation refers to behavior as if the transaction were the only interaction with the database during its duration
- D: Durability is one that retains the changes even if the process is stopped or shut down unexpectedly.
The following code is not Atomic. Why? What would we do to make sure it is atomic?

BEGIN TRAN
UPDATE Leads SET Gender = 'M' WHERE Name = 'Tony';
UPDATE Leads SET Gender = 'M' WHERE dob = 'July 2n 2019';
COMMIT

It's not atomic because if the 2nd statement fails whereas the 1st statement succeeded. They did not act as a single unit.

To correct this, we would place a SET XACT_ABORT ON statement prior to the begin transaction statement, forcing all items to be rolled back.
Look at the code below. What is the output created?

SET XACT_ABORT OFF;
GO
BEGIN TRANSACTION;
INSERT INTO t2 VALUES (1);
INSERT INTO t2 VALUES (2); -- Foreign key error.
INSERT INTO t2 VALUES (3);
COMMIT TRANSACTION;
GO

SET XACT_ABORT ON;
GO
BEGIN TRANSACTION;
INSERT INTO t2 VALUES (4);
INSERT INTO t2 VALUES (5); -- Foreign key error.
INSERT INTO t2 VALUES (6);
COMMIT TRANSACTION;
GO

SELECT * FROM t2;
GO

1,3

That is, value #2 was an error, but script kept processing. With items #4,5,6... an error occurred but all transactions were rolled back.
Give 4 examples of ACID compliance or non-compliance, each representing the letter respectfully.

(a) A transaction with two updates, first working and second one failing because of a foreign key constraint. If no rollback was performed, then one update is made (not committed) while the other is not successful (atomic).

(b) When we try to add/delete a record which has an associated foreign key constraint then we are enforcing data integrity (consistency) such that we don't allow this action.

(c) Begin a transaction to update a table (but without a commit transaction statement); then open a new window and try to read from that same table. We can't because the table is locked for the duration of the transaction, that is, the transaction is isolating itself so that no other changes or views of the data can exist until the transaction is complete (isolation).

(d) Transaction logs which are written to prior to being committed, thereby, even during a power outage, the changes can be rolled back or committed after reading the logs.
There are 3 types of transaction types used by SQL. Implicit, explicit, and auto commit transactions. Define each and how do we set them?

(a) SET IMPLICIT_TRANSACTIONS ON; When ON, this means that if @@TRANCOUNT = 0, any of the following Transact-SQL statements begins a new transaction [drop, execute as, delete create, alter, truncate, fetch, etc]. It is equivalent to an unseen BEGIN TRANSACTION being executed first. When OFF, we say the transaction mode is autocommit.

(b) Explicit Transactions: Accomplished by using the typical BEGIN TRANSACTION END TRANSACTION followed by either COMMIT TRANSACTION or ROLLBACK TRANSACTION (explicitly). You have nothing to configure or statements to execute. However, you have several commands which can't be used in an explicit transaction including: CREATE/ALTER/DROP DATABASE | INDEX | CATALOG, BACKUP, RESTORE, RECONFIGURE.

(c) Auto Commit Transactions: the default transaction management mode of the SQL Server Database Engine. Same as (a) above with switch turned to OFF. Every Transact-SQL statement is committed or rolled back when it completes. If a statement completes successfully, it is committed; if it encounters any error, it is rolled back.
Describe how transactions and @@TRANCOUNT work together and how it is incremented or decremented and when a transaction log is completed.

Each BEGIN TRANSACTION increments the @@TRANCOUNT variable and each COMMIT TRANSACTION decrements that counter. The ROLLBACK TRANSACTION resets the @@TRANCOUNT to zero and rolls back all successful statements to the beginning of the first transaction but does not abort the procedure. When @@TRANCOUNT is at zero, SQL Server writes to the transaction log. If the session ends before writing to the transaction log, SQL Server automatically rolls back the transaction.
What is savepoint transactions?
A savepoint is a named location from which a transaction can restart if part of it is conditionally cancelled; that is, you can rollback a transaction to a specific savepoint if a statement doesn't complete successfully.
- BEGIN TRANSACTION
- INSERT ...
- SAVE TRANSACTION StartTran;
- SELECT...
- DELETE...
- SAVE TRANSACTION DeleteTran
- ...
- ...
- ROLLBACK TRANSACTION StartTran;
A high concurrency database should support a high number of simultaneous users and processes that do not interfere with one another while preserving the consistency of the data affected by those changes.

What are four possible concurrency problems that may arise and how do we guard against these problems?
- (a) Dirty reads: ne user reads data that has already been modified but uncommitted by another. After the read, the change is committed and therefor 'dirty'.
- (b) Non-repeatable read: One in which data read twice inside the same transaction cannot be guaranteed to contain the same value. Depending on the isolation level, another transaction could have nipped in and updated the value between the two reads.
- (c) Phantom read: A phantom read occurs when, in the course of a transaction, new rows are added or removed by another transaction to the records being read.
- (d) Lost update: Occurs when two users read and update a single value simultaneously, one will win (but data is lost) as the 2nd user overwrites that value.
In all of these cases SQL Server provides various locks on resources at a transaction level to support concurrency and avoid these types of conflicts.
What are the locks on resources used by SQL Server Lock Manager?

S, U, X, IS, IU, IX, SIX, Sch-M, Sch-S, and BU.

(a) Shared (S): When imposed, will reserve a resource to be available only for reading, which means that any other transaction will be prevented to modify the locked record as long as the lock is active. However, a shared lock can be imposed by several transactions at the same time over the same page or row and in that way several transactions can share the ability for data reading since the reading process itself will not affect anyhow the actual page or row data. In addition, a shared lock will allow a transfer to an update operations, but no DDL changes will be allowed.

(b) Update (U): Similar to an exclusive lock but is designed to be more flexible in a way. An update lock can be imposed on a record that already has a shared lock. In such a case, the update lock will impose another shared lock on the target row. Once the transaction that holds the update lock is ready to change the data, the update lock (U) will be transformed to an exclusive lock (X). It is important to understand that update lock is asymmetrical in regards of shared locks. While the update lock can be imposed on a record that has the shared lock, the shared lock cannot be imposed on the record that already has the update lock

(c) Exclusive (X): when imposed, will ensure that a page or row will be reserved exclusively for the transaction that imposed the exclusive lock, as long as the transaction holds the lock. The exclusive lock will be imposed by the transaction when it wants to modify the page or row data, which is in the case of DML statements DELETE, INSERT and UPDATE. An exclusive lock can be imposed to a page or row only if there is no other shared or exclusive lock imposed already on the target. This practically means that only one exclusive lock can be imposed to a page or row, and once imposed no other lock can be imposed on locked resources (unless explicit NOLOCK command is used).

(d) Intent (IS, IU, IX, SIX): This lock is a means used by a transaction to inform another transaction about its intention to acquire a lock. The purpose of such lock is to ensure data modification to be executed properly by preventing another transaction to acquire a lock on the next in hierarchy object. In practice, when a transaction wants to acquire a lock on the row, it will acquire an intent lock on a table, which is a higher hierarchy object. By acquiring the intent lock, the transaction will not allow other transactions to acquire the exclusive lock on that table (otherwise, exclusive lock imposed by some other transaction would cancel the row lock).

(e) Schema (Sch-M, Sch-S): Sch-M will be acquired when a DDL statement is executed, and it will prevent access to the locked object data as the structure of the object is being changed. SQL Server allows a single schema modification lock (Sch-M) lock on any locked object. In order to modify a table, a transaction must wait to acquire a Sch-M lock on the target object. Once it acquires the schema modification lock (Sch-M), the transaction can modify the object and after the modification is completed and the lock will be released.

The Sch-S will be acquired while a schema-dependent query is being compiled and executed and execution plan is generated. This particular lock will not block other transactions to access the object data and it is compatible with all lock modes except with the schema modification lock (Sch-M).

(f) Bulk Update (BU): This lock is designed to be used by bulk import operations when issued with a TABLOCK argument/hint. When a bulk update lock is acquired, other processes will not be able to access a table during the bulk load execution. However, a bulk update lock will not prevent another bulk load to be processed in parallel. But keep in mind that using TABLOCK on a clustered index table will not allow parallel bulk importing.
Diagram a lock compatibility table for commonly block call requests.
- EXISTING (horizontal): S U X IS IX SIX
- REQUESTs (vertical): S U X IS IX SIX
- S: Y Y N Y N N
- U: Y N N Y N N
- X: N N N N N N
- IS: Y Y N Y Y Y
- IX: N N N Y Y N
- SIX: N N N Y N N
SQL Server uses dynamic lock management to increase concurrency. What are some common objects which can have locks placed on them?

RID (row), key, page, file, table, metadata, database, application.
How does SQL Server manage conflicts between two transactions attempting to change the same data at the same time?

What is pessimistic and optimistic concurrency?

(a) We use isolation levels to manage concurrency conflicts.

(b) Two approaches to concurrency issues, optimistic is one which hopes their aren't too many problems and tries to address these by allowing read access to certain transactions while pessimistic is one which expects their to be problems or demands that no concurrency problem can ever exist (think bank account) such that a lock is placed until the transaction is fully committed.
What are the 5 types of isolation levels for concurrency issues?

(a) Read uncommitted - the least restrictive, it allows data to be read prior to it being committed, ignoring all locks and doing a read, this would be a good candidate for reporting but it also allows for all error types (dirty reads, lost updates, etc).

(b) Read committed - default, uses pessimistic locking prevent others from reading or modifying data until lock is released (prevents dirty reads but not others).

(c) Repeatable read - (database default), here you prevent any data read by one transaction from being modified by another, preventing all read errors except phantoms.

(d) Serializable - the most pessimistic, which uses range locks on the data to prevent both updates and inserts (preventing all error types).

(e) Snapshot - allow read and write operations to run concurrently without blocking one another.

(f) Read committed Snapshot - Similar to both read committed and snapshot, it is set at database level (like snapshot) before setting it for a transaction. It allows a transaction to read and keep reading not just from the start of a transaction but throughout.
Draw a 4 x 4 table showing isolation levels to read phenomena.
Write each of the 6 isolation level commands.
(a) SET TRANSACTION ISOLATION LEVEL READ COMMITTED

(b) SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

(c) SET TRANSACTION ISOLATION LEVEL REPEATABLE READ

(d) SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
- (e) ALTER DATABASE DB SET ALLOW_SNAPSHOT_ISOLATION ON;
- SET TRANSACTION ISOLATION LEVEL SNAPSHOT
- (f) ALTER DATABASE DB SET READ_COMMITTED_SNAPSOT ON;
- SET TRANSACTION ISOLATION LEVEL READ COMMITTED SNAPSHOT
How do we know what isolation level we are in?

We could test for each by opening two sessions and viewing them separately, running queries in each to see what state or actions are performed on each.

For example, we could start a transaction and write to a log in one window and prior to committing those changes, read the same record from the other window. If the query doesn't complete, we know there is a lock on that record so we are possibly in a Read Committed state, we also know we are not in a Read Uncommitted state because we can't read and that isolation level ignores locks. Further, if we haven't set database level isolations required for the snapshot and read committed snapshot levels, then we know we are not in those isolation levels.

Alternatively, we could check the status of the current connection, run DBCC USEROPTIONS. This returns properties about the connection, including its isolation level. Mine states: Isolation Level = read committed
Write a select query using an inner join and making it serializable. What does this type of query do?
- SELECT a.Column1, b.Column2
- FROM TableA a WITH(SERIALIZABLE)
- JOIN TableB b WITH(SERIALIZABLE);
This type of query shows us that the isolation levels can be set at the database level or even at the query level.
How are isolation levels and locks related? What is the difference between an Update lock vs an Exclusive lock?

Each transaction (implicit or otherwise) must be implemented through a series of hierarchy of locks. The life of the row may begin as a Shared (S) lock followed by a Exclusive (X) lock so that we can modify the data.

An exclusive lock means no one can access the resource whatsoever, reading or writing until the transaction is completed and the lock released. In an Update lock, we will be doing an update and thereby changing the lock to an exclusive lock, but before doing this, we can still allow reads.
The goal of isolation levels is to ensure that queries return consistent results while running concurrent processes.

Explain in detail what locks and/or behavior we can expect in each of the isolation levels.

Read Committed: Two types of locks are acquired, a shared (S) lock for reading and then when writing, it uses the exclusive (X) lock. Any changes to the data are not visible to other operations for the during of the (X) lock.

Read Uncommitted: Here, we ignore all locks all read both committed and uncommitted data. While no locks are acquired, we still can't read schema modification locks.

Repeatable Read: Acquires shared (S) locks on the data for the duration of the transaction, therefor reads block write operations in other transactions, which can affect performance as deadlocks become more frequent.

Serializable: Here, we acquire locks for read operations and key-range locks to prevent other transactions from inserting/modifying data until the transaction is complete. This reduces all concurrency problems but likely slows performance due to the numerous locks.

Snapshot: No locks are acquired, consequently, deadlocks and lock escalations occur less frequently and performance is faster and consistency is higher. Snapshot isolation avoids most locking and blocking by using row versioning. When data is modified, the committed versions of affected rows are copied to tempdb and given version numbers. This operation is called copy on write and is used for all inserts, updates and deletes using this technique. When another session reads the same data, the committed version of the data as of the time the reading transaction began is returned.

Read Committed Snapshot: Uses less tempdb space than snapshot, but still uses a copy of the data and releases that copy once a write is complete.
How do we troubleshoot locking issues?

To troubleshoot, we first need to determine which resources are locked, why they are locked and the lock type in effect. We can use DMVs (dynamic management views).

(a) sys.dm_tran_locks: use this DMV to view all current locks, lock resources, and lock modes, and configurations. For troubleshooting blocking situations, we look for CONVERT in the status column since it indicates an isolation level lock was upgraded to a different lock mode and is currently blocked.

(b) sys.dm_os_waiting_tasks (wait tasks): use this DMV to see which tasks are waiting for a resource, use this to diagnose current queries taking too long to process, gives lots of specific details so is the main query we would use to troubleshoot.

(c) sys.dm_os_wait_stats (wait stats): use this to view historical information including how often processes are waiting while locks are taken. This returns several counts and lock times but in a general form such that we can see 32 times we had a shared lock on a particular resource. Coincidently, this is the only DMV which can be reset manually without resetting the server.
What is lock escalation? How would we prevent it?

Lock escalation is the process of converting many fine-grained locks (such as row or page locks) into table locks. Microsoft SQL Server dynamically determines when to perform lock escalation. When making this decision, SQL Server considers the number of locks that are held on a particular scan, the number of locks that are held by the whole transaction, and the memory that is being used for locks in the system as a whole. Typically, SQL Server's default behavior results in lock escalation occurring only at those points where it would improve performance or when you must reduce excessive system lock memory to a more reasonable level.

To prevent lock escalation, we would need to write SQL which doesn't take too many resources/time to complete. For example, we may process a batch file import by breaking up the transactions into smaller quantities, say only updating 1000 records at a time or by using a date field, or other similar means.
What is pessimistic locking vs optimistic locking?

Optimistic locking is when you check if the record was updated by someone else before you commit the transaction. Pessimistic locking is when you take an exclusive lock so that no one else can start modifying the record.

Optimistic locking is going to be much lower overhead than pessimistic locking – grab the item you need, update it quickly and attempt to apply it. If some other actor in the system beat you to it, you can just retry till you succeed.
Create script which will result in the following error, "Transaction was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction."
(a) Write the following code below, making careful not to commit or rollback the transaction.
- BEGIN TRAN
- UPDATE dbo.Leads SET FirstName = 'John' WHERE Id = 2;
- WAITFOR DELAY '00:00:05'
- UPDATE dbo.Test SET fullname = 'Jim' WHERE keyvalue = '63634B0D-1972-4FDF-BDC6-4C54A43D6CB4';
(b) Now open a new session and write the exact same query except reversing the order of the two tables so that dbo.Test is written to first, followed by dbo.Leads.

(c) Now run the first window, then within 5 seconds run the second window. Both will freeze and after about 10 seconds the deadlock will be finished and one session will have the error displayed above while the other window will have 2 records updated.
What is a memory-optimized table? Why are 5-6 reasons why or situations we should use them?

Tables which are loaded twice, once into memory and the other onto disk (IO) for durability reasons. We only use the memory portion of the table unless a failure occurs and we create these tables like any other except using the 'with (memory_optimized = on) option. Further, you can validate if this is a memory optimized table by clicking properties and under the General Tab you can see Options: Memory Optimized (true/false).

We use these for (a) speed because they can easily be ~20% quicker than traditional tables, (b) reduce bottlenecks and deadlocks, (c) are great for highly complex business logic that can eat away resources or in high volume settings where speed is critical (d) real-time data is needed, (e) session state management used in HTTP settings since every single request by a user uses this table to maintain state (f) ETL operations since we are dealing with LOTS of data being moved around quickly.

The real benefit, is when used with natively compiled procedures which then can make use of these tables, increasing speed 300-500% or even more.
How do we create a memory optimized table?

We add the line 'WITH (MEMORY_OPTIMIZED = ON)' as the last line of our create table command.

Of course, creating a table with that can't be done until the database has been configured to allow for this which requires first creating a FILEGROUP which uses memory rather than IO resources. If we open a database property, we can navigate to FILEGROUPS and add a new entry under MEMORY OPTIMIZED DATA. Then under FILES, we add a new database file and for File Type we choose 'FILESTREAM Data' and for FILEGROUP we choose our just created filegroup (ie, INMEMORY).
What are natively compiled stored procedures?

These are stored procedures which can only access memory tables and not IO tables. They are compiled at the time of creation (as opposed to the time of execution) and convert their code to C code and then into machine language, enabling business logic to both execute and access data faster and more efficiently.

Some restrictions are that they cannot access any tables in tempdb, can't use cursors or the CASE/MERGE statement, can't use the SELECT INTO clause, use subqueries in an UPDATE statement, CTEs, EXECUTE WITH RECOMPILE, reference a view, use % or TOP clause, use DISTINCT or aggregate functions, or PIVOTS.
How do we create a natively compiled procedure?

We use the command, 'WITH NATIVE_COMPILATION, SCHEMABINDING' and then wrap the procedure in a BEGIN ATOMIC...END statement and a WITH (statement...) containing the isolation level and language used, such as 'WITH(TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'ENGLISH')
How would we compare the speed of a interpreted procedure vs a natively compiled procedure?

We could create two procedures of each type, the interpreted updating an IO data column 50,000 times and timing it, and then doing the same with a natively compiled procedure but accessing an in-memory table doing the same.

The results of our book test showed that updates using memory were about 550% faster than those which relied on IO resources.

While we could have created a procedure which accesses a memory table, the improvement would have only been about 5% improvement.
What is a covering index and why would we use one?

A covering index is one which returns all the necessary columns required to resolve the query and is often a great solution for Key Lookup and poor running queries that aren't reliant on a primary key.

For example, if we have a query which returns First, Last, and Email and a search on Email, we might create an index with all 3 items, or better yet, create the Email index and use an INCLUDE with both First and Last names as too many indexes means maintaining all those keys whereas this solution only creates one index and allows us to still use the other two values.
Create the best possible index for the Customers table with the following query in mind, assuming a clustered index is already placed on the Id field.

SELECT Id, First, Last FROM Customers WHERE Email = 'a@b.com';
- CREATE NONCLUSTERED INDEX [ix_Customer_Email] ON [dbo].[Customers]
- (
- EMAIL
- )
- INCLUDE(FirstName, LastName)
Drop a memory-optimized table called SupportEvent then recreate it the table again with failover of schema and data.
- DROP TABLE IF EXISTS SupportEvent;
- go
- CREATE TABLE SupportEvent
- (
- SupportEventId int not null identity(1,1) PRIMARY KEY NONCLUSTERED,
- StartDateTime datetime2 not null,
- CustomerName nvarchar(16) not null,
- Priority int null,
- Description nvarchar(64) null
- )
- WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
- go
Memory-optimized tables can have up to 8 non-clustered indexes (MO's can't use clustered indexes) and this limitation was lifted in 2017. Two flavors of MOs exist: durable and non-durable, if non-durable, then no transactions are written to the IO making them much quicker. However, during a catastrophic power failure, all information will be permanently lost. If durable, then the transactions are done in memory and then written to IO after a delay, that is, user gets success but IO hasn't been written to just yet.

What types of indexes can exist on a memory optimized table? Provide two examples.
Hash, Columnstore, Nonclustered B-tree.
- OrderId INT PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
- CustomerCode NVARCHAR(5) NULL INDEX IX_CustomerCode CLUSTERED COLUMNSTORE
When creating memory-optimized tables, how do you manage durability of the data via logging?
We do this implicitly or explicitly in our code. To guarantee full durability (default), simply use the option:

WITH (MEMORY_OPTIMIZED = ON, DURABILITY=SCHEMA_AND_DATA);

Or we can choose non-durable which means we persist schema but not the underlying data, making it quicker since it has less overhead.

WITH (MEMORY_OPTIMIZED = ON, DURABILITY=SCHEMA_ONLY);

One final way to do this is using delayed durability wherein at the database level we alter the database to turn this feature on and then in each transaction we can can set it on. Like so,
- ALTER DATABASE Exam
- SET DELAYED_DURABILITY = ALLOWED;
- BEGIN TRANSACTION
- WITH (DELAYED_DURABILITY = 0N, TRANSACTION ....)
When is it best to use memory-optimized tables with natively compiled stored procedures?
- (a) Applications which require excellent performance
- (b) Queries that execute frequently
- (c) Tasks that must be performed quickly
- (d) Queries with lots of nested loops, aggregation, complex expressions.
Best not to use if we are updating a single row because it takes a lot of overhead.
Alter the database to allow for memory optimized tables.
Preferably we do this in the UI by selecting Database | Properties | FileGroups | Add FileGroup | (create name). We then need to select OK and close page so we don't get an error. We open up again and this time choose Files | Add | (create name) | and then choose Filestream and pick a file name. Alternatively, we can write code below.
- ALTER DATABASE MYDB
- SET AUTO_CLOSE OFF; -- optional
- ADD FILEGROUP mem_data CONTAINS MEMORY_OPTIMIZED_DATA;
- GO
- ALTER DATABASE MYDB
- ADD FILE (NAME = 'MemoryData', FILENAME = 'c:\memory.ndf')
- TO FILEGROUP mem_data;
How do we analyze performance statistics for natively compiled procedures?

Disabled by default, we can enable them on the server by enabling the two procedures.

EXEC sys.sp_xtp_control_proc_exec_stats 1: for our procedure level statistics

EXEC sys.sp_xtp_control_query_exec_stats 1: for our query level statistics.
Explain each of the calls below.

set statistics time on;
set statistics io on;
set statistics profile on;
set showplan_all on;

select * from sys.dm_tran_locks;
select * from sys.dm_os_waiting_tasks
select * from sys.dm_os_wait_stats

dbcc useroptions;

exec sys.sp_xtp_control_proc_exec_stats 1;
exec sys.sp_xtp_control_query_exec_stats 1;

throw 5000, 'This is an error', 1;
raiserror('This is an error', 16, 1);
- set statistics time on; -- see elapsed statistics on messages
- set statistics io on; -- see statistics including scan count, logical reads, physical reads, read-aheads, lob, etc.
- set statistics profile on; --see time estimates of rows, cpu, io, scan/seek type, etc
- set showplan_all on; --show execution information without actually executing the statements
- select * from sys.dm_tran_locks; -- show active transaction locks and current state
- select * from sys.dm_os_waiting_tasks -- show current waiting transactions & locks
- select * from sys.dm_os_wait_stats -- show historical stats for transactions & locks
dbcc useroptions; -- show isolation level, language, db size, dateformat, etc
- exec sys.sp_xtp_control_proc_exec_stats 1; --include native procedure stats
- exec sys.sp_xtp_control_query_exec_stats 1; --include native query stats
- throw 5000, 'This is an error', 1; -- stops processing
- raiserror('This is an error', 16, 1); -- keeps processing (can't be used in natively compiled procedures)
What 4 tools can we use to analyze method plans?
- Profiler (old and famous)
- Trace (deprecated)
- Extended Events
- Query Store
What is SQL Profiler? How do we open it? What is its biggest drawback to use?

GUI to SQL Trace to monitor various events running in the SQL Server instance.

We open it under SSMS under Tools | SQL Server Profiler

A poorly performing server will not run even slowing because SQL Profiler uses lots of resources.
What is SQL Traces? How would we open it? What does it look like?

SQL Trace will do the same thing as SQL Profiler and provide various statistics on the database but without the GUI. Further, we can start the trace and leave it running for a day and then come back to it to review the results.

We can still open it using SQL Profiler | File | Export | Script Trace | SQL 2005 - 2016. We provide a name which is where the file will be located and stored as a *.sql file.

File is a script file of SQL and can be exchanged as such.
What are extended events? Why use them? How can we use them? Go through the steps in creating one.

They are awesome! Lightweight monitoring tool which collects data about inner event operations and helps troubleshoot performance problems. For example, CPU/memory usage, locks, logons, sql queries, and much more.

Great for use because they are resource light, very extensive (lots of them), very portable (can send by email/sql), and provide every piece of information collected by the system. They are the most efficient way to analyze query plans for SQL Server.

We could see that a client has a CPU problem and create an extended event to diagnose the issue, send it to them by email (SQL statement), and then have them install it (via executing SQL statement) to see CPU performance.

In SSMS | Management (folder) | Extended Events | Sessions | New Sessions. Here we can create our name, add our events we'd like to monitor, have results saved to an external file, and have the extended events saved to a *.xel file.
George Patton (General of the US Army during WWII) had a quote that is applicable to query plans. What is that quote?

"A good plan, violently executed now is better than a perfect plan tomorrow." - George Patton.

In other words, when executing a query, the database will first attempt to find a reasonably good execution plan and run it rather than working out every possible way to execute the query, while imperfect, it always gets the correct result but another plan may have been slightly more efficient had we kept searching.
What is a query execution plan? Why do we read it from right-to-left?

Visual representation of the operations performed by the SQL Server engine to return valid data for the query.

We read from right-to-left because often the most resource intensive cost is done on rightmost entities, therefor our biggest performance improvements can often be had here (but not always).
What are query execution plan warnings? suggestions/hints, query cost in batch?
What are the most cost intensive operations found using the query plan? How would we test changes to the query to determine if our changes improved these operations?
- (a) Scan (vs Seek)
- (b) Parallelism (seen as rows in query that eventually come together usually in a merge or nested loop, etc.)
- (c) Implicit conversion (not using indexes)
- (d) Sort
We would use "SET STATISTICS IO, TIME ON;" before and after our query changes to validate if our changes improve performance.
Query plan suggests a missing index should help your query perform faster. What should you do and how would you do it?

Not all suggestions are good, however, if we believe the suggestion is valid, we can simply create our index from scratch or better yet, right click and choose Missing Index Details which actually contains script to create the index suggested.
You run the query plan below and find the query to run too slow and have too many logical/physical reads. How can you improve the query if you were to assume an index exists on each ID field of each table?

SELECT i.AccountsPersonId, o.OrderId, il.Description
FROM Sales.Invoices I
INNER JOIN Sales.InvoiceLines il ON i.InvoiceID = il.InvoiceID
INNER JOIN Sales.Orders o ON o.OrderID = i.OrderID
WHERE i.BillToCustomerID > 100
GO
NOTE: An index can only be made on a single table but return multiple values within that table!

We would consider indexes on the elements being returned, filtered, or joined (see below candidates). However, if we can only pick one query, it would be on the WHERE clause as this affects everything else. Below is list of possible candidates.
- SELECT: i.AccountsPersonId, il.Description
- JOIN: i.InvoiceId (pk), il.InvoiceId (fk), i.OrderId (fk), o.OrderId (pk)
- WHERE: i.BillToCustomerId
- CREATE NONCLUSTERED INDEX [IX_Invoice_Cols]
- ON [Sales].[Invoices] (BillToCustomerId)
- INCLUDE([OrderId], [AccountsPersonId])
You run the query below and have added an index to improve performance; however, you can still see IO times below. Should you create another index? If so, what index?

InvoiceLines: Scans: 1, logical reads 5003, physical reads 0
Invoices: Scans: 1, logical reads 136, physical reads 0
Orders: Scans: 1, logical reads 138, physical reads 0
SELECT i.AccountsPersonId, o.OrderId, il.Description
FROM Sales.Invoices I
INNER JOIN Sales.InvoiceLines il ON i.InvoiceID = il.InvoiceID
INNER JOIN Sales.Orders o ON o.OrderID = i.OrderID
WHERE i.BillToCustomerID > 100
GO

We can see that the InvoiceLines table is having too many logical reads (memory reads). So we want to concentrate our index there.

CREATE NONCLUSTERED INDEX [IX_InvoiceLines_Cols] ON [Sales].[InvoiceLines] ([InvoiceId, [Description])
You see a query plan that uses Parallelism with 8 threads. How would you test to remove these threads?

Change the query so that we force it to use only a single thread.

OPTION(MAXDOP 1)

MAXDOP = max degree of parallelism
What is query store? Where do we find it? What are 3 important query stores? What are some common query store reports?

Tracks query plans, statistics, and historical data to help diagnose performance degradation over time. Available in all editions, identify resource inte3nsive queries, database level feature, query plan and execution history, forcing query plans.

We can use this to tune SQL on the fly, using its history to tune itself.

Highlight your database and select Properties and go to last item in list, Query Store. We can turn on/off and set options such as data flush intervals, statistical collections, and more.

Plan, runtime stats, wait stats

Regressed queries (were running good, but now bad), overall resource consumption, top resource consuming queries, queries with forced plans, high variations (ie, high memory usage followed by zero usage), query wait statistics (most popular), tracked queries.

Author

mateotete

354000

Card Set

Prep Developing SQL Databases (Exam 70-762)

Description

Exam 70-762 Developing SQL Databases

Updated

2021-01-02T13:12:37Z

Show Answers