Interview Questions on SQL: Find Top Data Developers




Most SQL interview processes are broken because they reward the wrong behavior. A candidate memorizes the difference between DELETE and TRUNCATE, recites it on cue, and walks out looking sharp. Meanwhile, nobody has learned whether they can untangle a messy join, explain a bad execution plan, or keep a reporting query from chewing through your cloud bill.
That's backwards.
SQL remains a core language for relational databases, and it's still a standard interview focus for developers, data analysts, and DBAs, as summarized in GeeksforGeeks' SQL interview overview. The problem isn't that companies ask interview questions on SQL. The problem is that they ask lazy ones.
Good SQL hiring doesn't come from trivia. It comes from watching how someone reasons. Can they break a problem into steps? Do they understand when a query is logically correct but operationally terrible? Can they explain tradeoffs across joins, grouping, nulls, and indexing without sounding like they swallowed a certification guide?
That's what separates a useful hire from a future incident report.
Modern SQL screens have gotten more practical. Coursera notes that interviews often include whiteboard tests, query reading, debugging, clause execution order, and output prediction, which is a lot closer to actual work than flash-card nonsense in its guide to SQL interview questions. Good. About time.
Below is the version I'd use. Not a museum of textbook definitions. A hiring playbook. Each question includes what a strong answer sounds like, what it really reveals, and the follow-up prompts that expose whether the candidate understands SQL or just rehearsed it in the hotel lobby.
Table of Contents
This question is basic. That's exactly why it's useful.
A strong candidate should tell you WHERE filters rows before aggregation, while HAVING filters grouped results after aggregation. If they can't do that cleanly, stop pretending the rest of the interview is going well.
It tells you whether the candidate understands query flow, not just syntax. SQL interviews often include execution-order thinking and debugging tasks, and this kind of question sits right in that lane, as noted earlier from Coursera.
Use a simple business example. Say you run an e-commerce system and want orders over a certain amount, then only customers who placed many qualifying orders. WHERE handles the order-level filter. HAVING handles the grouped customer threshold.
That distinction matters because people who misuse HAVING often write bloated queries that aggregate far more data than necessary.
Practical rule: If a filter can go in
WHERE, it usually should. Don't make the database group junk rows just so you can throw them away later.
Ask these after they give the textbook answer:
HAVING to WHERE, when does that change the result?”WHERE often better for large datasets?”A good answer might use a SaaS billing example. Filter usage records for the current billing period in WHERE, group by department, then use HAVING to keep only departments over budget.
A weak candidate gives definitions. A strong one explains intent, correctness, and efficiency. That's the difference.
If you ask only one category of interview questions on SQL, ask joins. Dataquest recommends starting with JOINs because they appear in almost every interview, alongside grouping, subqueries, and NULL handling, in its 2026 SQL interview guide.
That tracks with real hiring. Most production pain starts with bad joins, not exotic syntax.
A candidate should explain:
Nice. But don't stop there. Anyone can memorize Venn-diagram chatter.
Ask for a real scenario.
A good one: “Show all customers and any orders they've placed.” That should trigger LEFT JOIN, because customers with no orders still matter. Another: “Find employees not assigned to projects.” Same pattern. If they default to INNER JOIN for everything, they're going to unintentionally drop important records in production and then blame the dashboard.
Ask what happens to unmatched rows. Ask where NULLs appear. Ask how duplicates can multiply when the relationship isn't one-to-one. That's where the rehearsed answers fall apart.
The best join answer isn't “INNER means intersection.” It's “I'd use LEFT JOIN here because the business wants missing relationships to remain visible.”
If you're interviewing for a data-heavy role, pair this with a more advanced prompt from CloudDevs' data engineer interview questions. The combination tells you who can join tables and who can think through the shape of a result set before they run the query.
A primary key is the column, or combination of columns, that uniquely identifies each row in a table. It shouldn't be null. It shouldn't be duplicated. It shouldn't be treated like optional plumbing someone adds “later” after the schema has already gone feral.
That last part is where real teams get burned.
This isn't just a database fundamentals question. It tells you whether the candidate respects data identity.
If someone says, “We can always use the name field,” you've learned something valuable and unpleasant. If they say a primary key should be stable, unique, and chosen to support relationships cleanly, you're talking to someone who has seen a system survive contact with real users.
Use practical examples. employee_id in an HR system. order_id in commerce. A composite key like tenant_id plus resource_id in a multi-tenant application when uniqueness only exists inside the tenant boundary.
Don't ask for a lecture on theory. Ask where people make bad decisions.
A strong candidate will usually mention that meaningful data changes, which makes poor key material. They'll also mention that missing primary keys make updates, deduplication, and relationships much harder than they need to be.
If a table doesn't have a real primary key, it usually means the team gave up on modeling and hoped application code would save them. That rarely ends well.
Good hires don't just define a primary key. They defend why it matters.
This one separates people who know SQL from people who merely visit it.
A foreign key links one table to another by referencing a primary key or unique key in the parent table. In plain English, it stops your database from inventing relationships that don't exist.
Say you have projects, developers, and assignments. Without foreign keys, an assignment can point to a developer who doesn't exist, a project that was deleted, or both. Congratulations, your data model is now fan fiction.
Foreign keys enforce referential integrity by requiring child rows to reference valid parent rows. That matters in billing systems, marketplaces, HR platforms, and any product where relationships are the business.
A strong candidate should also understand that foreign keys influence design decisions. Deleting a parent row can cascade, restrict, or fail depending on the rule you choose. None of those are “just syntax.” They reflect business intent.
Ask questions with consequences:
ON DELETE CASCADE and when would you avoid it?”Good candidates usually say foreign key columns should be indexed when they're used in joins or lookups. They also know that cascade deletes can be convenient for strict parent-child data, but dangerous when records are shared, audited, or legally sensitive.
A foreign key is less about database purity and more about refusing to let bad application code make up fake relationships.
If they understand that, keep going. If they don't, don't hand them ownership of anything with billing, compliance, or customer data attached.
This is one of those questions that sounds small until someone gets it wrong in production.
UNION combines result sets and removes duplicates. UNION ALL combines result sets and keeps everything. That's the definition. Fine. The issue is whether the candidate understands when deduplication is helpful, when it's wasteful, and when it hides data problems.
Suppose you're combining historical orders with current-month orders during a migration. If the datasets are already mutually exclusive, UNION ALL is usually the honest choice. If you use UNION out of habit, you may force unnecessary deduplication work and mask overlap you needed to investigate.
On the other hand, if two systems can produce duplicate customer IDs and you need a unique list for outreach, UNION makes sense.
That's the true interview signal. Judgment.
Ask the candidate to choose between them in a scenario, then justify it.
A good candidate should mention that both queries must return compatible columns in the same order. They should also know that ORDER BY belongs at the end of the combined result, not sprinkled inside each branch like decorative parsley.
Short version: if they answer only with “one removes duplicates,” you've learned the minimum. If they explain data intent and performance tradeoffs, you've found someone who won't casually flatten your reporting logic.
Now we're getting somewhere.
A candidate who can discuss indexing clearly is usually far more useful than someone who can recite obscure syntax. Strong SQL performance depends on tuning habits like using EXPLAIN, indexing JOIN and WHERE columns, and filtering early for large datasets, as highlighted in this SQL sales analysis interview discussion.
That's the kind of answer you want. Practical. Not ceremonial.
An index is a data structure that helps the database find rows faster. It speeds up reads for certain access patterns. It also adds overhead to writes, because inserts, updates, and deletes may need to maintain the index.
That tradeoff matters. Indexing everything is not strategy. It's panic with admin privileges.
Use scenarios like these:
A good candidate should talk about indexing columns used in WHERE, JOIN, and ORDER BY. They may mention composite indexes when filters commonly use multiple columns together. They should also warn against blindly indexing low-value columns and should verify plans with EXPLAIN.
Hiring shortcut: Ask, “What's the downside of adding an index?” If they say “none,” end the suspense. They're not ready.
This question also opens the door to more modern, useful SQL evaluation. Today's interviews increasingly reward understanding optimizer behavior, execution plans, cardinality, and why a query is slow across different engines. That matters because PostgreSQL, MySQL, SQL Server, BigQuery, Snowflake, and Databricks don't behave identically. Anyone hiring serious SQL talent should care about that.
Normalization gets mocked because people associate it with dusty theory and joyless whiteboards. That's unfair. Bad normalization creates real, expensive messes.
Normalization means organizing data to reduce redundancy and improve integrity. Store facts once, relate them properly, and stop duplicating business truth across random tables like you're hiding Easter eggs from your future self.
You're testing whether the candidate can design a schema that won't rot.
Say you store company names directly in every developer row. Now a company rebrands. Have fun updating that everywhere and praying nobody misspells it in one place. A normalized design moves company data into a companies table and references it cleanly.
That's not academic. That's maintenance sanity.
Don't reward someone for rattling off normal forms like a trivia host. Ask them to explain a messy table they'd split apart and why.
A strong answer usually lands around normalizing to a practical level, often the point where redundancy is controlled and updates stay predictable. Then, only denormalize for a specific workload you can justify.
If you want a broader framework for evaluating answers, CloudDevs' database design best practices is a useful companion because it pushes the conversation toward maintainable schemas instead of interview theater.
Normalization isn't about making schemas pretty. It's about making bad data harder to create.
That's a standard worth keeping.
A view is a saved SELECT query that acts like a virtual table. Useful, yes. Magical, no.
This question matters because it exposes whether the candidate knows how to simplify access patterns without turning the database into a hall of mirrors.
Views shine when teams repeatedly need the same logic.
Maybe finance needs a monthly billing summary. Maybe customer support needs a safe public profile view that excludes salary or contract fields. Maybe analysts keep rebuilding the same join across users, subscriptions, and usage events. A view can centralize that logic and reduce repeated mistakes.
That said, views are not a free lunch. A candidate should know that piling views on views on views can make debugging miserable. If a query is already hard to understand, hiding it behind another layer of abstraction doesn't fix the problem. It just makes the next person curse under their breath.
Ask these instead of “Can views be updated?” and other quiz-show bait.
A strong answer usually says views are good for consistency, access control, and reusable reporting logic. A better one adds that expensive aggregations may need materialization depending on the engine and workload.
A view is a contract. It tells the rest of the team, “Use this shape of data, not your own improvised version.”
That mindset is more valuable than the definition alone.
Hiring managers should ask this question more carefully than they usually do, because it exposes a common and expensive mistake. Plenty of candidates can recite definitions. Fewer can tell you which clause fits the business question in front of them.
DISTINCT removes duplicate rows from the result set. GROUP BY groups rows so you can summarize them with aggregates like COUNT, SUM, or AVG.
The syntax overlap fools weak candidates.
Ask for a concrete example and watch what happens. If you need a list of countries where your developers work, SELECT DISTINCT country is the clean answer. If you need the number of developers in each country, you use GROUP BY country with COUNT(*). Same column. Different job.
That distinction matters in production. A candidate who treats DISTINCT and GROUP BY as interchangeable usually writes muddy reporting queries, hides duplicate-join problems with DISTINCT, and ships totals nobody trusts.
This is not a vocabulary test. It tells you whether the candidate understands query intent.
Good candidates explain the difference in plain English: DISTINCT deduplicates output, while GROUP BY changes the grain of the result so you can aggregate. Strong candidates go one step further and point out that using DISTINCT to "fix" duplicate rows is often a red flag. It can mask a bad join or a broken data model instead of solving the actual issue.
That answer separates someone who debugs SQL from someone who decorates it.
Use prompts that force judgment, not memorization.
DISTINCT the right answer, and when is it hiding a join problem?”DISTINCT would be the wrong tool.”GROUP BY fail if I select a column that isn't grouped or aggregated?”GROUP BY return the same rows as DISTINCT? If yes, why would you still prefer one over the other?”A solid candidate says non-grouped columns usually need an aggregate. A stronger one adds that while GROUP BY can sometimes mimic DISTINCT, DISTINCT is clearer when you only need uniqueness and no summary calculation.
DISTINCTanswers, “Which values exist?”GROUP BYanswers, “How do these rows roll up?”
Ask that way and you learn who understands SQL logic, not just SQL syntax.
Subqueries are where candidates either show composure or start building a nest of parentheses they can't escape.
A subquery is a query nested inside another query. Used well, it breaks a problem into manageable parts. Used badly, it turns a readable statement into a hostage situation.
A solid answer should explain common uses like filtering against a calculated value, checking existence, or staging logic before a final result.
Classic example: find employees paid above the average salary. The subquery calculates the average, and the outer query filters against it. Straightforward. Useful. No drama.
For more advanced roles, this gets more interesting. Dataquest notes that interviews for data analysts increasingly emphasize JOINs, GROUP BY, subqueries, NULL handling, and especially window functions, while live-coding screens reward problem restatement, relevant column selection, and query decomposition in its guide mentioned earlier. That's a key indicator. Can they decompose the problem calmly?
Don't ask “What is a correlated subquery?” and sit back like you've done something clever. Ask for tradeoffs.
EXISTS be better than IN?”A good candidate often says they test the inner query on its own first. Excellent. That's what people do who plan to finish the task this century.
They should also know that some subqueries are elegant, while others should be rewritten for readability or performance.
If a candidate can explain a subquery and then improve it, you're probably talking to someone who writes SQL for work, not just interviews.
| Item | Implementation complexity | Resource requirements | Expected outcomes | Ideal use cases | Key advantages |
|---|---|---|---|---|---|
| WHERE vs HAVING | Low, placement in query matters (WHERE before aggregation, HAVING after) | WHERE: minimal; HAVING: requires aggregation memory/CPU | Correct stage-specific filtering; better performance when filtering pre-aggregation | Filter rows before GROUP BY (WHERE); filter aggregated groups (HAVING) | WHERE: early filtering for speed; HAVING: filters on aggregates |
| JOIN types (INNER, LEFT, RIGHT, FULL) | Medium, join logic and aliasing can be complex with many tables | Can be I/O and memory intensive on large tables; benefits from indexed join columns | Combined datasets with different inclusion semantics (matches vs outer rows) | Combining related tables for reports, enrichment, and data integration | Flexible data combination; handles missing data via outer joins |
| PRIMARY KEY | Low, simple to declare, design choices can be impactful | Small storage for index; automatic index maintenance | Uniquely identifies rows and speeds lookups | Defining unique identifiers for entities (users, orders, instances) | Ensures uniqueness, fast lookups, supports foreign keys |
| FOREIGN KEY | Medium, requires schema planning and cascade rules | Minor storage/index overhead; extra write-time checks | Enforced referential integrity; prevents orphaned rows | Enforcing relationships (projects ? developers, invoices ? companies) | Automatic integrity enforcement; documents relationships |
| UNION vs UNION ALL | Low, syntax straightforward; behavior differs on duplicates | UNION: higher CPU/memory for dedup and sort; UNION ALL: minimal overhead | Combined result sets; UNION removes duplicates, UNION ALL preserves them | Merging datasets from multiple sources or timeframes | UNION: unique merged results; UNION ALL: faster, preserves duplicates |
| INDEX | Medium, requires selection of columns and maintenance strategy | Increased storage; read speed improved, write operations slower | Faster SELECT/JOIN/ORDER queries; slower INSERT/UPDATE/DELETE | Speeding frequent lookups, joins, and ordered queries on large tables | Dramatically improves read performance when used appropriately |
| Normalization | High, requires careful schema design and trade-offs | More tables and joins may increase read-time resources; less redundancy storage | Reduced redundancy and anomalies; easier updates and consistency | Foundational schema design for multi-tenant, maintainable databases | Improves data integrity and reduces storage redundancy |
| VIEW | Low, simple to create; complexity depends on underlying query | Non-materialized: little storage, runtime cost; materialized: storage + refresh cost | Encapsulated queries, consistent interfaces; may hide complexity | Simplifying reports, security filters, reusable complex queries | Abstraction, reusability, security via column/row restriction |
| DISTINCT vs GROUP BY | Low, syntax simple; purpose differs (dedupe vs aggregate) | DISTINCT: sorting/dedup cost; GROUP BY: aggregation memory/CPU | DISTINCT returns unique rows; GROUP BY returns grouped aggregates | Use DISTINCT for uniqueness; GROUP BY for aggregated metrics and analytics | DISTINCT: simple deduplication; GROUP BY: supports aggregation functions |
| SUBQUERY (nested query) | Medium to high, correlated subqueries add complexity | Can be expensive; correlated subqueries may run repeatedly; JOINs often cheaper | Enables stepwise filtering and derived comparisons; may be slower | Multi-step filters, existence checks, derived-value comparisons | Expressive and modular for complex logic; isolates steps within a query |
A good SQL interview isn't a pop quiz. It's an x-ray.
The questions above matter because they uncover how someone thinks about data shape, correctness, relationships, filtering, reuse, and performance. But on their own, they're still incomplete. A candidate can answer all of them reasonably well and still struggle when the data is messy, the business definition is vague, or the query runs slowly for reasons no textbook bothered to mention.
That's why the true test needs one practical exercise.
Give them a small but imperfect dataset. Include duplicates. Include missing values. Include a business prompt that sounds like something your team deals with. Ask them to define the metric, explain assumptions, write the query, and then talk through how they'd validate the result. Don't rescue them too quickly. Let them think out loud.
That's where the difference shows up.
The strongest candidates don't race to type. They restate the problem. They identify the tables and columns that matter. They notice where NULL handling changes the output. They ask clarifying questions about edge cases. If the role leans analytical, they may reach for window functions, because advanced SQL interviews often use real analytical workloads like month-over-month comparisons, top-N ranking, cumulative metrics, partitioning, date truncation, and rolling aggregations rather than syntax trivia, as discussed in the verified material above.
That's the person you want.
And if your environment uses a modern warehouse stack, stop pretending generic textbook SQL is enough. Many interview resources still center on joins, grouping, and window functions but skip the harder issue of engine behavior and portability. In real teams, candidates often need to understand optimizer behavior, execution plans, CTE materialization, index usage, cardinality estimates, and dialect quirks across PostgreSQL, MySQL, SQL Server, BigQuery, Snowflake, or Databricks. If you don't test for that where relevant, you're hiring for the wrong century.
The same goes for analytics hiring. A lot of teams no longer use SQL just to fetch rows from neat transactional tables. They use it to define metrics, interrogate event streams, handle duplicates, deal with late-arriving records, and validate business logic in cloud warehouses. Increasingly, they also need people who can review SQL generated by AI tools and spot what the model got wrong. That judgment matters. A candidate who can critique generated SQL is usually more valuable than one who can merely produce boilerplate from memory.
So here's the blunt version.
Ask fewer trivia questions. Ask better ones. Use the conceptual questions in this list to establish fundamentals, then move quickly into a hands-on exercise with realistic constraints. Watch how the person reasons when the answer isn't prepackaged. That's how you identify someone who will improve your data work instead of turning every dashboard bug into a three-team blame ritual.
If you want extra perspective on the people side of screening, Professional Careers Training's data analyst tips adds a useful complement around interview readiness and communication. Those soft edges matter once the SQL basics are covered.
And if your team doesn't want to spend its week building, refining, and administering technical screens, using a marketplace such as CloudDevs can be one practical option. CloudDevs states that it pre-vets Latin American developers and can provide a shortlist quickly, which may help companies that need to move faster without lowering the bar.
If you need developers who can handle real SQL work instead of just surviving interview theater, CloudDevs is worth a look. You can review pre-vetted LATAM talent, skip a chunk of the screening grind, and spend your time interviewing people who already clear the fundamentals.
Let’s be honest: hiring contractors isn't just about finding talent. It’s about avoiding disaster. Turns out there’s more than one way to hire elite developers without mortgaging your office ping-pong table. The real work is taking a hard look at the hidden costs of a bad hire—wasted time, torched projects, and a burnt-out team—and building...
Boost your team's output with 9 agile development best practices for planning, communication, and continuous improvement—spark faster delivery today.
Discover how to build effective software development plans that drive project success. Learn practical strategies for scope, budgeting, and team management.