50 SQL Interview Questions Every .NET Developer Should Know
Master SQL Server, query optimization, indexing, transactions, and database design with real interview questions used by companies hiring .NET developers.
Table of Contents
- 1. What is SQL?
- 2. What is a primary key?
- 3. What is a foreign key?
- 4. What is normalization?
- 5. What is denormalization?
- 6. What is the difference between DELETE, TRUNCATE, and DROP?
- 7. What is a JOIN?
- 8. What is the difference between INNER JOIN and LEFT JOIN?
- 9. What is a UNIQUE constraint?
- 10. What is a NULL value?
- 11. What is an Index?
- 12. What is a Clustered Index?
- 13. What is a Non-Clustered Index?
- 14. How do indexes improve performance?
- 15. Can too many indexes hurt performance?
- 16. What is index fragmentation?
- 17. What is a covering index?
- 18. What is a composite index?
- 19. What is query optimization?
- 20. How do you identify slow SQL queries?
- 21. What is a transaction?
- 22. What is ACID?
- 23. What is an isolation level?
- 24. What is Read Uncommitted?
- 25. What is Read Committed?
- 26. What is Repeatable Read?
- 27. What is Serializable?
- 28. What is a deadlock?
- 29. How do deadlocks occur?
- 30. How can deadlocks be prevented?
- 31. What is a Stored Procedure?
- 32. What are the advantages of Stored Procedures?
- 33. What is a View?
- 34. What is a Common Table Expression (CTE)?
- 35. What is a Window Function?
- 36. What is ROW_NUMBER()?
- 37. What is RANK()?
- 38. What is DENSE_RANK()?
- 39. What is a Subquery?
- 40. What is the difference between a Subquery and a CTE?
- 41. What is Database Sharding?
- 42. What is Database Partitioning?
- 43. What is Database Replication?
- 44. What is Eventual Consistency?
- 45. What is CQRS?
- 46. How would you design a database for millions of users?
- 47. How do you optimize large tables?
- 48. How do you handle database migrations?
- 49. How do you troubleshoot production database issues?
- 50. What SQL mistakes do developers commonly make?
1. What is SQL?
SQL (Structured Query Language) is the standard language used to communicate with relational databases. It allows developers to create tables, retrieve data, insert records, update existing information, and delete unwanted rows. SQL is supported by popular database systems such as SQL Server, PostgreSQL, MySQL, and Oracle. For .NET developers, SQL is particularly important because technologies such as Entity Framework Core ultimately generate SQL queries behind the scenes. Understanding SQL helps developers diagnose performance problems, optimize queries, and design efficient database schemas. Interviewers frequently ask SQL fundamentals because strong database knowledge is essential for building scalable business applications.
2. What is a primary key?
A primary key is a column or group of columns that uniquely identifies each row in a table. Primary keys enforce entity integrity by ensuring that duplicate records cannot exist. They are often implemented using identity columns or GUID values. Choosing an appropriate primary key is an important database design decision because it affects indexing, relationships, and query performance. Most relational databases automatically create a clustered index on the primary key unless configured otherwise.
3. What is a foreign key?
A foreign key is a database constraint that establishes a relationship between two tables. It ensures referential integrity by preventing invalid references between records. For example, an Orders table may contain a CustomerId foreign key that references the Customers table. Foreign keys help maintain data consistency and enable efficient joins across related entities. In Entity Framework Core, navigation properties are typically mapped to foreign key relationships in the underlying database.
4. What is normalization?
Normalization is the process of organizing data to reduce redundancy and improve consistency. Database designers typically apply normalization rules such as First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF). The goal is to eliminate duplicate data and ensure that each piece of information is stored in the most appropriate location. While normalization improves maintainability, excessive normalization can sometimes lead to complex queries and performance trade-offs.
5. What is denormalization?
Denormalization is the process of intentionally introducing redundancy into a database design. This technique is often used to improve query performance by reducing the number of joins required to retrieve data. Denormalization is common in reporting systems, data warehouses, and high-performance applications where read efficiency is more important than strict normalization. A senior developer should understand both normalization and denormalization and know when each approach is appropriate.
6. What is the difference between DELETE, TRUNCATE, and DROP?
DELETE removes rows from a table and can be filtered using a WHERE clause. TRUNCATE removes all rows from a table quickly and typically cannot be filtered. DROP completely removes the table structure along with its data. Understanding these commands is important because they have different effects on transactions, logging, permissions, and database recovery.
7. What is a JOIN?
A JOIN combines data from multiple tables based on a related column. Joins allow developers to retrieve meaningful business information that is distributed across multiple tables. Common examples include combining Orders with Customers or Products with Categories. Efficient use of joins is fundamental to relational database design and SQL query development. Interviewers frequently ask about joins because they are among the most commonly used SQL operations.
8. What is the difference between INNER JOIN and LEFT JOIN?
INNER JOIN returns only matching records from both tables. LEFT JOIN returns all records from the left table and matching records from the right table. If no match exists, NULL values are returned. Choosing the correct join type is important because it directly affects query results and business logic. Understanding join behavior is a core SQL interview topic.
9. What is a UNIQUE constraint?
A UNIQUE constraint ensures that duplicate values cannot exist within a column or combination of columns. Unlike a primary key, a table may contain multiple UNIQUE constraints. UNIQUE constraints are often used for business rules such as email addresses, usernames, or product codes where duplicates must be prevented. They improve data integrity and often create supporting indexes automatically.
10. What is a NULL value?
NULL represents the absence of a value rather than an empty string or zero. SQL treats NULL differently from normal values, which can affect comparisons, aggregations, and filtering operations. Developers must carefully handle NULL values to avoid unexpected query results and application bugs. Understanding NULL behavior is essential for writing reliable SQL queries.
🎁 Free .NET Interview PDF
Download 150 Real .NET Interview Questions
Includes C#, ASP.NET Core, Entity Framework, Async/Await, LINQ, System Design, Caching, Microservices and more.
No spam. Unsubscribe anytime.
11. What is an Index?
An index is a database structure that improves the speed of data retrieval operations. Without an index, the database may need to scan every row in a table to find matching records. Indexes work similarly to an index in a book by allowing the database engine to locate data much faster. While indexes improve read performance, they can slightly slow down INSERT, UPDATE, and DELETE operations because the index itself must also be maintained. Understanding indexes is one of the most important topics in SQL interviews because indexing directly impacts application performance.
12. What is a Clustered Index?
A clustered index determines the physical order of rows within a table. Because the data itself is stored according to the clustered index, a table can only have one clustered index. In SQL Server, the primary key often becomes the clustered index by default. Clustered indexes are highly efficient for range queries and sorting operations because related rows are stored together physically. Choosing an appropriate clustered index can significantly improve query performance in large databases.
13. What is a Non-Clustered Index?
A non-clustered index is a separate structure that contains indexed column values along with pointers to the actual data rows. Unlike clustered indexes, multiple non-clustered indexes can exist on a single table. They are useful for accelerating searches, filtering, and sorting operations. However, creating too many non-clustered indexes can increase storage requirements and slow down write operations. A senior developer should understand when additional indexes provide value and when they become unnecessary overhead.
14. How do indexes improve performance?
Indexes improve performance by reducing the amount of data that must be scanned to satisfy a query. Instead of examining every row in a table, the database engine can use an index to locate relevant records quickly. This reduces CPU usage, disk I/O, and query execution time. Indexes are particularly valuable in large tables where full table scans become expensive. Understanding how indexes work is essential for diagnosing slow queries and designing scalable database systems.
15. Can too many indexes hurt performance?
Yes. While indexes improve read performance, excessive indexing can negatively impact write performance. Every INSERT, UPDATE, or DELETE operation must also update affected indexes. As the number of indexes increases, database maintenance costs increase as well. Additional indexes also consume storage space and may increase query optimization complexity. Developers should create indexes based on actual query patterns rather than indexing every column indiscriminately.
16. What is index fragmentation?
Index fragmentation occurs when the logical order of index pages no longer matches their physical storage order. Fragmentation can increase disk I/O and reduce query performance because additional pages must be read to retrieve data. Database administrators often perform index maintenance operations such as rebuilding or reorganizing indexes to reduce fragmentation. Monitoring fragmentation is an important part of maintaining large production databases.
17. What is a covering index?
A covering index contains all columns required to satisfy a query. Because the database can retrieve all required data directly from the index, it avoids additional lookups to the underlying table. Covering indexes can significantly improve performance for frequently executed queries. However, they should be used carefully because larger indexes consume additional storage and increase maintenance overhead.
18. What is a composite index?
A composite index is an index built on multiple columns. It is useful when queries frequently filter or sort using more than one column. For example, an index on LastName and FirstName may improve searches involving both fields. The order of columns within a composite index is important because it affects which queries can benefit from the index. Understanding composite indexes demonstrates knowledge of advanced query optimization techniques.
19. What is query optimization?
Query optimization is the process of improving query performance while producing the same results. Optimization techniques include creating appropriate indexes, reducing unnecessary joins, selecting only required columns, and avoiding inefficient query patterns. Database engines also include query optimizers that automatically determine efficient execution plans. Developers who understand query optimization can significantly improve application scalability and user experience.
20. How do you identify slow SQL queries?
Slow SQL queries can be identified using execution plans, query profiling tools, monitoring dashboards, and database performance reports. Execution plans reveal how the database processes a query and can highlight expensive operations such as table scans, missing indexes, or inefficient joins. Developers should regularly monitor query performance in production environments to detect bottlenecks before they affect users. Performance tuning is one of the most valuable skills for senior developers working with data-intensive systems.
21. What is a transaction?
A transaction is a group of database operations that are treated as a single unit of work. Either all operations succeed together or all operations fail together. This ensures data consistency even when unexpected errors occur. Transactions are commonly used in banking systems, e-commerce platforms, and inventory management systems where partial updates could lead to incorrect data. Understanding transactions is essential for building reliable applications.
22. What is ACID?
ACID stands for Atomicity, Consistency, Isolation, and Durability. These four properties define the reliability guarantees provided by relational database transactions. Atomicity ensures all operations succeed or fail together. Consistency preserves valid data states. Isolation prevents concurrent transactions from interfering with each other. Durability guarantees committed data survives failures. ACID principles form the foundation of transactional database systems.
23. What is an isolation level?
An isolation level determines how transactions interact with each other when accessing shared data. Different isolation levels balance consistency and performance by controlling visibility of uncommitted changes and concurrent modifications. Common isolation levels include Read Uncommitted, Read Committed, Repeatable Read, and Serializable. Choosing the correct isolation level is important for preventing concurrency issues while maintaining scalability.
24. What is Read Uncommitted?
Read Uncommitted is the lowest isolation level available. Transactions can read data modified by other transactions even if those changes have not yet been committed. This may improve performance but introduces the risk of dirty reads, where applications access data that may later be rolled back. Because of these risks, Read Uncommitted is rarely used for critical business operations.
25. What is Read Committed?
Read Committed is the default isolation level in many database systems including SQL Server. Transactions can only read data that has already been committed. This prevents dirty reads while maintaining reasonable performance. Although Read Committed solves some concurrency issues, non-repeatable reads and phantom reads can still occur. It is often a good balance between consistency and scalability.
🎁 Free .NET Interview PDF
Download 150 Real .NET Interview Questions
Includes C#, ASP.NET Core, Entity Framework, Async/Await, LINQ, System Design, Caching, Microservices and more.
No spam. Unsubscribe anytime.
26. What is Repeatable Read?
Repeatable Read ensures that rows read during a transaction cannot be modified by other transactions until the current transaction completes. This prevents non-repeatable reads and improves consistency. However, it may increase locking and reduce concurrency because more resources remain locked for longer periods. Developers should understand the trade-offs between consistency and performance when choosing isolation levels.
27. What is Serializable?
Serializable is the highest standard isolation level. It guarantees that concurrent transactions produce results equivalent to executing one transaction at a time. While Serializable provides maximum consistency, it can significantly reduce concurrency and increase locking contention. It should be used only when strict consistency requirements justify the performance cost.
28. What is a deadlock?
A deadlock occurs when two or more transactions wait indefinitely for resources held by each other. For example, Transaction A may hold Lock 1 and wait for Lock 2, while Transaction B holds Lock 2 and waits for Lock 1. Because neither transaction can proceed, the database detects the deadlock and terminates one transaction as the victim. Deadlocks are common interview topics because they frequently occur in high-concurrency systems.
29. How do deadlocks occur?
Deadlocks typically occur when multiple transactions acquire resources in different orders. Long-running transactions, excessive locking, and inconsistent access patterns increase the likelihood of deadlocks. Although modern database engines automatically resolve deadlocks, excessive deadlocks can indicate architectural or query design problems. Developers should understand how locking behavior contributes to concurrency issues.
30. How can deadlocks be prevented?
Deadlocks can be reduced by accessing resources in a consistent order, keeping transactions short, creating proper indexes, and minimizing lock duration. Using lower isolation levels when appropriate can also reduce locking contention. Monitoring deadlock graphs and analyzing execution plans help identify problematic queries. Preventing deadlocks is often easier and more efficient than resolving them after deployment.
31. What is a Stored Procedure?
A Stored Procedure is a precompiled collection of SQL statements stored inside the database. Stored procedures can accept parameters, contain business logic, and execute multiple operations as a single unit. Because execution plans can be reused, stored procedures often improve performance and reduce network traffic between applications and databases. They are commonly used in enterprise systems for complex reporting, batch processing, and data manipulation tasks. However, excessive business logic inside stored procedures can make applications harder to maintain and test. A senior developer should understand both the benefits and trade-offs of using stored procedures.
32. What are the advantages of Stored Procedures?
Stored procedures provide several benefits including improved performance, centralized business logic, reduced network traffic, and enhanced security. Because SQL code executes directly on the database server, fewer round trips are required compared to sending multiple queries from an application. Stored procedures can also restrict direct table access and expose only approved operations. This improves security and helps enforce business rules. While modern ORMs reduce the need for stored procedures in many scenarios, they remain widely used in enterprise applications.
33. What is a View?
A View is a virtual table based on the result of a SQL query. Views simplify complex queries by encapsulating joins, filters, and aggregations behind a reusable database object. Applications can query a view as if it were a regular table. Views improve maintainability and can enhance security by exposing only specific columns or rows to users. They are frequently used for reporting, dashboards, and data abstraction layers.
34. What is a Common Table Expression (CTE)?
A Common Table Expression, or CTE, is a temporary named result set that exists only for the duration of a query. CTEs improve readability by breaking complex queries into smaller logical parts. They are especially useful for recursive operations such as hierarchical data traversal. Compared to nested subqueries, CTEs often make SQL easier to understand and maintain. Modern SQL development frequently uses CTEs to simplify reporting and analytical queries.
35. What is a Window Function?
A Window Function performs calculations across a set of rows related to the current row without collapsing the result set. Examples include ROW_NUMBER, RANK, DENSE_RANK, LAG, LEAD, and aggregate functions used with OVER clauses. Window functions are extremely useful for reporting, analytics, pagination, and ranking scenarios. Many senior-level SQL interviews include questions about window functions because they demonstrate advanced SQL knowledge.
36. What is ROW_NUMBER()?
ROW_NUMBER assigns a unique sequential number to each row within a result set. It is commonly used for pagination, ranking, and identifying duplicate records. Unlike RANK, ROW_NUMBER always generates unique values even when rows contain identical data. Understanding ROW_NUMBER is important because it appears frequently in real-world reporting and interview questions.
37. What is RANK()?
RANK assigns rankings to rows while allowing ties. If two rows share the same rank, the next rank value is skipped. For example, ranks may appear as 1, 2, 2, 4. This behavior makes RANK useful for leaderboards, reporting systems, and performance comparisons. Developers should understand the differences between RANK, DENSE_RANK, and ROW_NUMBER.
38. What is DENSE_RANK()?
DENSE_RANK is similar to RANK but does not skip ranking values after ties occur. For example, rankings appear as 1, 2, 2, 3 instead of 1, 2, 2, 4. This behavior is useful when consecutive rankings are required for business reporting and analytics. Understanding ranking functions demonstrates familiarity with advanced SQL techniques.
39. What is a Subquery?
A Subquery is a query nested inside another SQL query. Subqueries can be used in SELECT, FROM, WHERE, or HAVING clauses to perform intermediate calculations and filtering. Although powerful, excessive use of subqueries can sometimes reduce readability and performance. Developers should understand when subqueries are appropriate and when alternative approaches such as joins or CTEs are preferable.
40. What is the difference between a Subquery and a CTE?
A Subquery is embedded directly inside another query, whereas a CTE is defined separately at the beginning of a query. CTEs generally improve readability and maintainability for complex queries. They are particularly useful when logic must be reused multiple times within the same query. Subqueries are often suitable for simple operations, while CTEs are preferred for larger and more complex data transformations. Senior developers should understand the strengths and weaknesses of both approaches.
41. What is Database Sharding?
Database sharding is a scalability technique that distributes data across multiple database servers. Instead of storing all records in a single database, data is partitioned into smaller subsets called shards. Sharding improves scalability and reduces bottlenecks because workload is distributed across multiple machines. Large-scale systems such as social networks, e-commerce platforms, and SaaS products often use sharding to support millions of users.
42. What is Database Partitioning?
Partitioning divides large tables into smaller logical sections while remaining within the same database. This improves query performance, simplifies maintenance, and reduces the amount of data scanned during queries. Common partitioning strategies include range partitioning, list partitioning, and hash partitioning. Partitioning is often used for very large tables containing millions or billions of rows.
43. What is Database Replication?
Replication is the process of copying data from one database server to one or more secondary servers. Replication improves availability, fault tolerance, and read scalability. For example, applications may send write operations to a primary database while directing read operations to replicas. Understanding replication is important for designing highly available systems.
44. What is Eventual Consistency?
Eventual consistency is a consistency model commonly used in distributed systems. Instead of guaranteeing immediate consistency across all nodes, the system guarantees that data will become consistent over time. This approach improves scalability and availability but requires developers to handle temporary inconsistencies. Eventual consistency is widely used in cloud-native architectures and distributed databases.
45. What is CQRS?
CQRS stands for Command Query Responsibility Segregation. The pattern separates write operations from read operations, allowing each side to be optimized independently. CQRS is commonly combined with event sourcing, message queues, and microservices to improve scalability and maintainability. While powerful, CQRS introduces additional complexity and should only be applied when justified by business requirements.
46. How would you design a database for millions of users?
Designing a database for millions of users requires careful consideration of scalability, indexing, caching, partitioning, and replication. Developers should normalize data appropriately, create efficient indexes, minimize expensive joins, and use caching systems such as Redis to reduce database load. Monitoring and capacity planning are also essential because growth patterns often change over time. A strong answer should focus on scalability, performance, reliability, and maintainability.
47. How do you optimize large tables?
Large tables can be optimized through indexing, partitioning, archiving old data, query tuning, and efficient schema design. Developers should avoid SELECT *, use pagination for large result sets, and monitor execution plans regularly. As tables grow, maintenance tasks such as index rebuilding and statistics updates become increasingly important. Optimization should be guided by actual workload analysis rather than assumptions.
48. How do you handle database migrations?
Database migrations allow schema changes to be applied in a controlled and repeatable manner. Tools such as Entity Framework Core Migrations track schema changes through versioned migration files. A good migration strategy includes testing, rollback planning, automation, and minimizing downtime during deployments. Senior developers should understand how to evolve database schemas safely in production environments.
49. How do you troubleshoot production database issues?
Troubleshooting production database issues requires a systematic approach. Developers should examine query execution plans, monitor CPU and memory usage, analyze blocking and deadlocks, review indexes, and inspect application logs. Effective monitoring tools help identify bottlenecks before they impact users. Strong troubleshooting skills are highly valued because database issues often affect entire applications.
50. What SQL mistakes do developers commonly make?
Common SQL mistakes include using SELECT *, ignoring indexes, creating N+1 query problems, overusing cursors, writing inefficient joins, and failing to analyze execution plans. Other frequent issues include poor normalization, excessive locking, missing transaction boundaries, and retrieving more data than necessary. Most database performance problems originate from inefficient query design rather than hardware limitations. A senior developer should understand these pitfalls and know how to avoid them before they reach production systems.
Related Interview Questions
Ready for Senior .NET Interviews?
Get access to 200+ real interview questions covering C#, ASP.NET Core, EF Core, SQL, System Design, Microservices, Redis, and Async/Await. And think like a real senior-level.
Get .NET Interview Mastery Pro