Total Page Preview:   000000002188

Cloud Data Engineer SQL Server Interview Questions Answers for freshers and experienced

Question 1. What is the SQL server query execution sequence?
Answer: SQL server query execution sequence below :
  1. FROM -> goes to Secondary files via primary file
  2. WHERE -> applies filter condition (non-aggregate column)
  3. SELECT -> dumps data in tempDB system database
  4. GROUP BY -> groups data according to grouping predicate
  5. HAVING -> applies filter condition (aggregate function)
  6. ORDER BY -> sorts data ascending/descending
Question 2. What is Normalization?
  • Step by step process to reduce the degree of data redundancy.
  • Breaking down one big flat table into multiple table based on normalization rules. Optimizing the memory but not in term of performance.
  • Normalization will get rid of insert, update and delete anomalies.
  • Normalization will improve the performance of the delta operation (aka. DML operation): UPDATE, INSERT, DELETE
  • Normalization will reduce the performance of the read operation: SELECT
Question 3. What are the three degrees of normalization and how is normalization done in each degree?
  • A table is in 1NF when: All the attributes are single-valued.
  • With no repeating columns (in other words, there cannot be two different columns with the same information).
  • With no repeating rows (in other words, the table must have a primary key).
  • All the composite attributes are broken down into its minimal component.
  • There should be SOME (full, partial, or transitive) kind of functional dependencies between non-key and key attributes.
  • 99% of times, it’s usually 1NF.
A table is in 2NF when: 
  • It is in 1NF.
  • There should not be any partial dependencies so they must be removed if they exist.
A table is in 3NF when: 
  • It is in 2NF.
  • There should not be any transitive dependencies so they must be removed if they exist.
  • A stronger form of 3NF so it is also known as 3.5NF
  • We do not need to know much about it. Just know that here you compare between a prime attribute and a prime attribute and a non-key attribute and a non-key attribute.
Question 4. What are the different database objects ?
Answer: There are total seven database objects (6 permanent database object + 2 temporary database object)
  • Permanent DB objects
  • Table
  • Views
  • Stored procedures
  • User-defined Functions
  • Triggers
  • Indexes
Temporary DB object
  • Cursors
  • Temp Table
Question 5. What is collation?
Answer: Bigdata Hadoop: SQL Interview Question with Answers
Collation is defined as set of rules that determine how character data can be sorted and compared. This can be used to compare A and, other language characters and also depends on the width of the characters.
ASCII value can be used to compare these character data.
Question 7. What is a constraint and what are the seven constraints?
Answer: Constraint: something that limits the flow in a database.
  1. Primary key (PRIMARY KEY constraint uniquely identifies each record in a table)
  2. Foreign key (Foreign Key is a database key that is used to link two tables together)
  3. Check (Example: check if the salary of employees is over 50,000)
  4. Default (Example: If the salary of an employee is missing, place it with the default value.)
  5. Nullability (NULL or NOT NULL)
  6. Unique Key (Unique KEY constraint uniquely identifies each record with null value in a table)
  7. Surrogate Key (mainly used in data warehouse)

Question 7.  What are the differences between OLTP and OLAP?

Answer:  click here

Question 8.  What is a Surrogate Key ?
Answer: "Surrogate" means "Substitute".
Surrogate key is always implemented with a help of an identity column.
Identity column is a column in which the value are automatically generated by a SQL Server based on the seed value and incremental value.
Identity columns are ALWAYS INT, which means surrogate keys must be INT. Identity columns cannot have any NULL and cannot have repeated values. Surrogate key is a logical key.
Question 9.  What is a derived column, hows does it work, how it affects the performance of a database and how can it be improved?
Answer: The Derived Column a new column that is generated on the fly by applying expressions to transformation input columns.
Example: FirstName + ' ' + LastName AS 'Full name'
Derived column affect the performances of the data base due to the creation of a temporary new
Execution plan can save the new column to have better performance next time.
Question 10.   What is a Transaction?
Answer:  1- It is a set of TSQL statement that must be executed together as a single logical unit.
Has ACID (Atomicity, Consistency, Isolation and Durability) properties:
Atomicity: Transactions on the DB should be all or nothing. So transactions make sure that any operations in the transaction happen or none of them do.
Consistency: Values inside the DB should be consistent with the constraints and integrity of the DB before and after a transaction has completed or failed.
Isolation: Ensures that each transaction is separated from any other transaction occurring on the
Durability: After successfully being committed to the RDMBS system the transaction will not be lost in the event of a system failure or error.
2- Actions performed on explicit transaction:
BEGIN TRANSACTION: marks the starting point of an explicit transaction for a connection.
COMMIT TRANSACTION (transaction ends): used to end an transaction successfully if no errors were encountered. All DML changes made in the transaction become permanent.
ROLLBACK TRANSACTION (transaction ends): used to erase a transaction which errors are encountered. All DML changes made in the transaction are undone.
SAVE TRANSACTION (transaction is still active): sets a savepoint in a transaction. If we roll back, we can only rollback to the most recent savepoint. Only one save point is possible per transaction. However, if you nest Transactions within a Master Trans, you may put Save points in each nested Tran. That is how you create more than one Save point in a Master Transaction.
Question 11. How do you copy just the structure of a table?
Answer:  SELECT * INTO NewDB.TBL_Structure
FROM OldDB.TBL_Structure
WHERE 1 = 0 -- Put any condition that does not make any sense.
Question 12.What are the different types of Joins?
Answer: INNER JOIN: Gets all the matching records from both the left and right tables based on joining columns.
LEFT OUTER JOIN: Gets all non-matching records from left table & AND one copy of matching records from both the tables based on the joining columns.
RIGHT OUTER JOIN: Gets all non-matching records from right table & AND one copy of matching records from both the tables based on the joining columns.
FULL OUTER JOIN: Gets all non-matching records from left table & all non-matching records from right table & one copy of matching records from both the tables.
CROSS JOIN: returns the Cartesian product.
Question 13. What are the different types of Restricted Joins?
Answer: SELF JOIN: joining a table to itself
RESTRICTED LEFT OUTER JOIN: gets all non-matching records from
left side
RESTRICTED RIGHT OUTER JOIN - gets all non-matching records from
right side
RESTRICTED FULL OUTER JOIN - gets all non-matching records from left table & gets all non-matching records from right table.
Question 14. What is a sub-query?
Answer: It is a query within a query
SELECT <column_name> FROM <table_name>
WHERE <column_name> IN/NOT IN
<another SELECT statement>
Everything that we can do using sub queries can be done using Joins, but anything that we can do using Joins may/may not be done using Subquery.
Sub-Query consists of an inner query and outer query. Inner query is a SELECT statement the result of which is passed to the outer query. The outer query can be SELECT, UPDATE, DELETE. The result of the inner query is generally used to filter what we select from the outer query.
We can also have a subquery inside of another subquery and so on. This is called a nested Subquery. Maximum one can have is 32 levels of nested Sub-Queries.
Question 15. What are the SET Operators?
Answer: SQL set operators allows you to combine results from two or more SELECT statements.
SELECT Col1, Col2, Col3 FROM T2
Rule 1: The number of columns in first SELECT statement must be same as the number of columns in the second SELECT statement.
Rule 2: The metadata of all the columns in first SELECT statement MUST be exactly same as the metadata of all the columns in second SELECT statement accordingly.
Rule 3: ORDER BY clause do not work with first SELECT statement. ? UNION, UNION ALL, INTERSECT, EXCEPT
Question 16. What is a derived table?
Answer SELECT statement that is given an alias name and can now be treated as a virtual table and operations like joins, aggregations, etc. can be performed on it like on an actual table.
 Scope is query bound, that is a derived table exists only in the query in which it was defined. SELECT temp1.SalesOrderID, temp1.TotalDue FROM
(SELECT TOP 3 SalesOrderID, TotalDue FROM Sales.SalesOrderHeader ORDER BY TotalDue DESC) AS temp1 LEFT OUTER JOIN
(SELECT TOP 2 SalesOrderID, TotalDue FROM Sales.SalesOrderHeader ORDER BY TotalDue DESC) AS temp2 ON temp1.SalesOrderID = temp2.SalesOrderID WHERE temp2.SalesOrderID IS NULL
Question 17. What is a View?
Answer: Views are database objects which are virtual tables whose structure is defined by underlying SELECT statement and is mainly used to implement security at rows and columns levels on the base
One can create a view on top of other views.
View just needs a result set (SELECT statement).
We use views just like regular tables when it comes to query writing. (joins, subqueries, grouping )
We can perform DML operations (INSERT, DELETE, UPDATE) on a view. It actually affects the underlying tables only those columns can be affected which are visible in the view.
Question 18. What are the types of views?
Answer: There is three type of view:
  1. Regular View.
  2. Schemabinding View.
  3. Indexed View.
Question 19: What is a Regular View?
Answer: It is a type of view in which you are free to make any DDL changes on the underlying table.
-- create a regular view
Question 20: What is a  Schemabinding View?
Answer: It is a type of view in which the schema of the view (column) are physically bound to the schema of the underlying table. We are not allowed to perform any DDL changes
to the underlying table for the columns that are referred by the schemabinding view structure.
All objects in the SELECT query of the view must be specified in two part naming conventions (schema_name.tablename).
You cannot use * operator in the SELECT query inside the view (individually name the columns)
All rules that apply for regular view.
FROM dbo.T2 -- remember to use two part naming convention
Question 21. What is an Indexed View?
  • It is technically one of the types of View, not Index.
  • Using Indexed Views, you can have more than one clustered index on the same table if needed.
  • All the indexes created on a View and underlying table are shared by Query Optimizer to select the best way to execute the query.
  • Both the Indexed View and Base Table are always in sync at any given point.
  • Indexed Views cannot have NCI-H, always NCI-CI, therefore a duplicate set of the data will be created.
Question  22. What does WITH CHECK do?
  • WITH CHECK is used with a VIEW.
  • It is used to restrict DML operations on the view according to search predicate (WHERE clause) specified creating a view.
  • Users cannot perform any DML operations that do not satisfy the conditions in WHERE clause while creating a view.
  • WITH CHECK OPTION has to have a WHERE clause.
Question 23. What is a RANKING function and what are the four RANKING functions?
Answer: Ranking functions are used to give some ranking numbers to each row in a dataset based on some ranking functionality.
Every ranking function creates a derived column which has integer value.
Different types of RANKING function:
ROW_NUMBER(): assigns an unique number based on the ordering starting with 1. Ties will be given different ranking positions.
RANK(): assigns an unique rank based on value. When the set of ties ends, the next ranking position will consider how many tied values exist and then assign the next value a new ranking with consideration the number of those previous ties. This will make the ranking position skip placement. position numbers based on how many of the same values occurred (ranking not sequential).
DENSE_RANK(): same as rank, however it will maintain its consecutive order nature regardless of ties in values; meaning if five records have a tie in the values, the next ranking will begin with the next
ranking position.
<Ranking Function>() OVER(condition for ordering) -- always have to have an OVER clause
SELECT SalesOrderID, SalesPersonID,
DENSE_RANK() OVER(ORDER BY TotalDue) FROM Sales.SalesOrderHeader
NTILE(n): Distributes the rows in an ordered partition into a specified number of groups.
Question 24. What is PARTITION BY?
Answer: Creates partitions within the same result set and each partition gets its own ranking. That is, the rank starts from 1 for each partition.
Question 25. What is Temporary Table and what are the two types of it? They are tables just like regular tables but the main difference is its scope.
  • The scope of temp tables is temporary whereas regular tables permanently reside. ? Temporary table are stored in tempDB.
  • We can do all kinds of SQL operations with temporary tables just like regular tables like JOINs, GROUPING, ADDING CONSTRAINTS, etc.
  • Two types of Temporary Table
Local (Start with #):
  • #LocalTempTableName -- single pound sign
  • Only visible in the session in which they are created. It is session-bound.
Global(Start with ##):
  • ##GlobalTempTableName -- double pound sign
  • Global temporary tables are visible to all sessions after they are created, and are deleted when the session in which they were created in is disconnected.
  • It is last logged-on user bound. In other words, a global temporary table will disappear when the last user on the session logs off.
Question 26. Explain Variables ?
  • Variable is a memory space (place holder) that contains a scalar value EXCEPT table variables, which is 2D
  • data.
  • Variable in SQL Server are created using DECLARE Statement. ? Variables are BATCH-BOUND.
  • Variables that start with @ are user-defined variables.
Question 27. Explain Dynamic SQL (DSQL). ?
  • Dynamic SQL refers to code/script which can be used to operate on different data-sets based on some dynamic values supplied by front-end applications. It can be used to run a template SQL query against different tables/columns/conditions.
  • Declare variables: which makes SQL code dynamic.
  • Main disadvantage of D-SQL is that we are opening SQL Tool for SQL Injection attacks. You should build the SQL script by concatenating strings and variable.
Question 28. What is SQL Injection Attack?
  • Moderator’s definition: when someone is able to write a code at the front end using DSQL, he/she could use malicious code to drop, delete, or manipulate the database. There is no perfect protection from it but we can check if there is certain commands such as 'DROP' or 'DELETE' are included in the command line.
  • SQL Injection is a technique used to attack websites by inserting SQL code in web entry fields.
Question 29. What is SELF JOIN?
  • JOINing a table to itself
  • When it comes to SELF JOIN, the foreign key of a table points to its primary key. ? Ex: Employee(Eid, Name, Title, Mid)
  • Know how to implement it!!!
Question 30. What is Correlated Subquery?
Answer:  It is a type of subquery in which the inner query depends on the outer query. This means that that the subquery is executed repeatedly, once for each row of the outer query.
In a regular subquery, inner query generates a result set that is independent of the outer query.
FROM HumanResources.Employee E
FROM Sales.SalesPerson S
WHERE S.SalesPersonID = E.EmployeeID)
The performance of Correlated Subquery is very slow because its inner query depends on the outer query. So the inner subquery goes through every single row of the result of the outer subquery.
Question 31. What is the difference between Regular Subquery and Correlated Subquery?
Answer: Based on the above explanation, an inner subquery is independent from its outer subquery in Regular Subquery. On the other hand, an inner subquery depends on its outer subquery in Correlated Subquery.
Question 32. What are the differences between DELETE and TRUNCATE.?
  • DML statement that deletes rows from a table and can also specify rows using a WHERE clause.
  • Logs every row deleted in the log file.
  • Slower since DELETE records every row that is deleted.
  • DELETE continues using the earlier max value of the identity column. Can have triggers on DELETE.
  • DDL statement that wipes out the entire table and you cannot delete specific rows.
  • Does minimal logging, minimal as not logging everything. TRUNCATE will remove the pointers that point to their pages, which are deallocated.
  • Faster since TRUNCATE does not record into the log file. TRUNCATE resets the identity column.
  • Cannot have triggers on TRUNCATE.
Question 33. What are the three different types of Control Flow statements?
  1. WHILE
  2. IF-ELSE
  3. CASE

Question 34. What is Table Variable? Explain its advantages and disadvantages?

Answer: Click here


Question 35. What are characteristics and advantages of stored procedure?

Answer: Click here

Question 36. What are the differences between Temporary Table and Table Variable?

Temporary Table:
  • It can perform both DML and DDL Statement. Session bound Scope
  • Syntax CREATE TABLE #temp
  • Have indexes
Table Variable:
  • Can perform only DML, but not DDL Batch bound scope
  • DECLARE @var TABLE(...)
  • Cannot have indexes
Question 37. What is Stored Procedure (SP)?
Answer:  It is one of the permanent DB objects that is precompiled set of TSQL statements that can accept and return multiple variables.
It is used to implement the complex business process/logic. In other words, it encapsulates your entire business process.
Compiler breaks query into Tokens. And passed on to query optimizer. Where execution plan is generated the very 1st time when we execute a stored procedure after creating/altering it and same execution plan is utilized for subsequent executions.
Database engine runs the machine language query and execute the code in 0's and 1's.
When a SP is created all Tsql statements that are the part of SP are pre-compiled and execution plan is stored in DB which is referred for following executions.
Explicit DDL requires recompilation of SP's.
Question 38. What are the four types of SP?
Answer:  System Stored Procedures (SP_****): built-in stored procedures that were created by Microsoft.
User Defined Stored Procedures: stored procedures that are created by users. Common naming convention (usp_****)
CLR (Common Language Runtime): stored procedures that are implemented as public static methods on a class in a Microsoft .NET Framework assembly.
Extended Stored Procedures (XP_****): stored procedures that can be used in other platforms such as Java or C++.
Question 39. Explain the Types of SP..? 
  • SP with no parameters:
  • SP with a single input parameter:
  • SP with multiple parameters:
  • SP with output parameters:
Extracting data from a stored procedure based on an input parameter and outputting them using output
SP with RETURN statement (the return value is always single and integer value)
Question 40. What is User Defined Functions (UDF)?
  • UDFs are a database object and a precompiled set of TSQL statements that can accept parameters, perform complex business calculation, and return of the action as a value.
  • The return value can either be single scalar value or result set-2D data. ? UDFs are also pre-compiled and their execution plan is saved.

Question 41. What is Difference between Stored Procedure and Function in SQL Server

Answer: Click here


Question 42. What are the different types of Error Handling in SQL Server?

Answer: Click here

Question 43. Explain about Cursors ..?
Answer:  Cursors are a temporary database object which are used to loop through a table on row-by-row basis. There are five types of cursors:
  1. Static: shows a static view of the data with only the changes done by session which opened the cursor.
  2. Dynamic: shows data in its current state as the cursor moves from record-to-record.
  3. Forward Only: move only record-by-record
  4. Scrolling: moves anywhere.
  5. Read Only: prevents data manipulation to cursor data set.
Question  44. What is the difference between Table scan and seek ?
  • Scan: going through from the first page to the last page of an offset by offset or row by row.
  • Seek: going to the specific node and fetching the information needed.
  • Seek: is the fastest way to find and fetch the data. So if you see your Execution Plan and if all of them is a seek, that means it’s optimized.
Question  45. Why are the DML operations are slower on Indexes?
  • It is because the sorting of indexes and the order of sorting has to be always maintained.
  • When inserting or deleting a value that is in the middle of the range of the index, everything has to be rearranged again. It cannot just insert a new value at the end of the index
Question  46. What is a heap (table on a heap)?
Answer: When there is a table that does not have a clustered index, that means the table is on a heap. ? Ex: Following table ‘Emp’ is a table on a heap.
 SELECT * FROM Emp WHERE ID BETWEEN 2 AND 4 -- This will do scanning.
Question  47. What is the architecture in terms of a hard disk, extents and pages?
  • A hard disk is divided into Extents.
  • Every extent has eight pages.
  • Every page is 8KBs (8060 bytes).
Question  48. What are the nine different types of Indexes?
  1. Clustered
  2. Non-clustered
  3. Covering
  4. Full Text Index
  5. Spatial
  6. Unique
  7. Filtered
  8. XML
  9. Index View
Question  49. What is a Clustering Key?
Answer:  It is a column on which I create any type of index is called a Clustering Key for that particular index.
Question  50 . Explain about a Clustered Index.?
  • Unique Clustered Indexes are automatically created when a PK is created on a table.
  • But that does not mean that a column is a PK only because it has a Clustered Index.
  • Clustered Indexes store data in a contiguous manner. In other words, they cluster the data into a certain spot on a hard disk continuously.
  • The clustered data is ordered physically.
  • You can only have one CI on a table.
Question 51. What happens when Clustered Index is created?
  • First, a B-Tree of a CI will be created in the background.
  • Then it will physically pull the data from the heap memory and physically sort the data based on the clustering
  • Then it will store the data in the leaf nodes.
  • Now the data is stored in your hard disk in a continuous manner.
Question  52. What are the four different types of searching information in a table?
  1. Table Scan -> the worst way
  2. Table Seek -> only theoretical, not possible 
  3. Index Scan -> scanning leaf nodes
  4. Index Seek -> getting to the node needed, the best way
Question  53. What is Fragmentation .?
  • Fragmentation is a phenomenon in which storage space is used inefficiently.
  • In SQL Server, Fragmentation occurs in case of DML statements on a table that has an index.
  • When any record is deleted from the table which has any index, it creates a memory bubble which causes fragmentation.
  • Fragmentation can also be caused due to page split, which is the way of building B-Tree dynamically according to the new records coming into the table.
  • Taking care of fragmentation levels and maintaining them is the major problem for Indexes.
  • Since Indexes slow down DML operations, we do not have a lot of indexes on OLTP, but it is recommended to have many different indexes in OLAP.
Question 54. What are the two types of fragmentation?
1. Internal Fragmentation
It is the fragmentation in which leaf nodes of a B-Tree is not filled to its fullest capacity and contains memory
2. External Fragmentation
It is fragmentation in which the logical ordering of the pages does not match the physical ordering of the pages on the secondary storage device.
Question 55. What are Statistics?
Answer: Statistics allow the Query Optimizer to choose the optimal path in getting the data from the underlying table. ? Statistics are histograms of max 200 sampled values from columns separated by intervals.
Every statistic holds the following info:
  1. The number of rows and pages occupied by a table’s data
  2. The time that statistics was last updated
  3. The average length of keys in a column
  4. Histogram showing the distribution of data in column

Top 16 performance query optimization techniques in SQL Database?

Answer: Click here


Question 35 What is ROWID and ROWNUM in SQL?
  1. ROWID is nothing but Physical memory allocation
  2. ROWID is permanant to that row which identifies the address of that row.
  3. ROWID is 16 digit Hexadecimal number which is uniquely identifies the rows.
  4. ROWID returns PHYSICAL ADDRESS of that row.
  5. ROWID is automatically generated unique id of a row and it is generated at the time of insertion of row.
  6. ROWID is the fastest means of accessing data.
  1. ROWNUM is nothing but the sequence which is allocated to that data retreival bunch.
  2. ROWNUM is tempararily allocated sequence to the rows.
  3. ROWNUM is numeric sequence number allocated to that row temporarily.
  4. ROWNUM returns the sequence number to that row.
  5. ROWNUM is an dynamic value automatically retrieved along with select statement output.
  6. ROWNUM is not related to access of data.


Question 55 What is Deadlock?

  • Deadlock is a situation where, say there are two transactions, the two transactions are waiting for each other to release their locks.
  • The SQL automatically picks which transaction should be killed, which becomes a deadlock victim, and roll back the change for it and throws an error message for it.
Question 56. What is CLAUSE?
Answer: SQL clause is defined to limit the result set by providing condition to the query. This usually filters some rows from the whole set of records.
Example - Query that has WHERE condition Query that has HAVING condition.
Question 57. What is Union, minus and Interact commands?
Answer: UNION operator is used to combine the results of two tables, and it eliminates duplicate rows from the tables.
MINUS operator is used to return rows from the first query but not from the second query. Matching records of first and second query and other rows from the first query will be displayed as a result set.
INTERSECT operator is used to return rows returned by both the queries.
Question 58. How to select unique records from a table?
Answer: Select unique records from a table by using DISTINCT keyword.
Select DISTINCT StudentID, StudentName from Student.
Question 59. What is the difference between a connection and session ?
Connection: It is the number of instance connected to the database. An instance is modelized soon as the application is open again.
Session: A session run queries.In one connection, it allowed multiple sessions for one connection.


Question 60. Explain Execution Plan.?
Answer:  Query optimizer is a part of SQL server that models the way in which the relational DB engine works and comes up with the most optimal way to execute a query. Query Optimizer takes into account amount of resources used, I/O and CPU processing time etc. to generate a plan that will allow query to execute in most efficient and faster manner. This is known as EXECUTION PLAN.
Optimizer evaluates a number of plans available before choosing the best and faster on available. Every query has an execution plan.
Definition by the mod: Execution Plan is a plan to execute a query with the most optimal way which is generated by Query Optimizer. Query Optimizer analyzes statistics, resources used, I/O and CPU
processing time and etc. and comes up with a number of plans. Then it evaluates those plans and the most optimized plan out of the plans is Execution Plan. It is shown to users as a graphical flow chart that should be read from right to left and top to bottom.

Thank You

About Author

Brijesh Kumar

Database Developer

I have more then 6 years Experience in Microsoft Technologies - SQL Server Database, ETL Azure Cloud - Azure SQL Database, CosmosDB, Azure Data Factory, PowerBI, Web Job, Azure Function, Azure Storage, Web Apps, Powershall and Database Migration On-Premise to Azure Cloud.
LinkedIn :





Facebook User: