Interview Questions on SQL: Find Top Data Developers

Most SQL interview processes are broken because they reward the wrong behavior. A candidate memorizes the difference between DELETE and TRUNCATE, recites it on cue, and walks out looking sharp. Meanwhile, nobody has learned whether they can untangle a messy join, explain a bad execution plan, or keep a reporting query from chewing through your cloud bill.

That's backwards.

SQL remains a core language for relational databases, and it's still a standard interview focus for developers, data analysts, and DBAs, as summarized in GeeksforGeeks' SQL interview overview. The problem isn't that companies ask interview questions on SQL. The problem is that they ask lazy ones.

Good SQL hiring doesn't come from trivia. It comes from watching how someone reasons. Can they break a problem into steps? Do they understand when a query is logically correct but operationally terrible? Can they explain tradeoffs across joins, grouping, nulls, and indexing without sounding like they swallowed a certification guide?

That's what separates a useful hire from a future incident report.

Modern SQL screens have gotten more practical. Coursera notes that interviews often include whiteboard tests, query reading, debugging, clause execution order, and output prediction, which is a lot closer to actual work than flash-card nonsense in its guide to SQL interview questions. Good. About time.

Below is the version I'd use. Not a museum of textbook definitions. A hiring playbook. Each question includes what a strong answer sounds like, what it really reveals, and the follow-up prompts that expose whether the candidate understands SQL or just rehearsed it in the hotel lobby.

1. What Is the Difference Between WHERE and HAVING Clauses?

This question is basic. That's exactly why it's useful.

A strong candidate should tell you WHERE filters rows before aggregation, while HAVING filters grouped results after aggregation. If they can't do that cleanly, stop pretending the rest of the interview is going well.

What this really tells you

It tells you whether the candidate understands query flow, not just syntax. SQL interviews often include execution-order thinking and debugging tasks, and this kind of question sits right in that lane, as noted earlier from Coursera.

Use a simple business example. Say you run an e-commerce system and want orders over a certain amount, then only customers who placed many qualifying orders. WHERE handles the order-level filter. HAVING handles the grouped customer threshold.

That distinction matters because people who misuse HAVING often write bloated queries that aggregate far more data than necessary.

Practical rule: If a filter can go in WHERE, it usually should. Don't make the database group junk rows just so you can throw them away later.

Follow-up prompts that expose real understanding

Ask these after they give the textbook answer:

  • Pushdown judgment: “If I move a condition from HAVING to WHERE, when does that change the result?”
  • Performance awareness: “Why is WHERE often better for large datasets?”
  • Concrete example: “Show me a query that uses both correctly.”

A good answer might use a SaaS billing example. Filter usage records for the current billing period in WHERE, group by department, then use HAVING to keep only departments over budget.

A weak candidate gives definitions. A strong one explains intent, correctness, and efficiency. That's the difference.

2. Explain INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN

If you ask only one category of interview questions on SQL, ask joins. Dataquest recommends starting with JOINs because they appear in almost every interview, alongside grouping, subqueries, and NULL handling, in its 2026 SQL interview guide.

That tracks with real hiring. Most production pain starts with bad joins, not exotic syntax.

A visual guide illustrating SQL joins using overlapping shapes, a magnifying glass, and office supplies.

What you want to hear

A candidate should explain:

  • INNER JOIN: Return matching rows from both tables.
  • LEFT JOIN: Return all rows from the left table, with matches from the right.
  • RIGHT JOIN: Return all rows from the right table, with matches from the left.
  • FULL OUTER JOIN: Return all matched and unmatched rows from both tables.

Nice. But don't stop there. Anyone can memorize Venn-diagram chatter.

The follow-up that matters

Ask for a real scenario.

A good one: “Show all customers and any orders they've placed.” That should trigger LEFT JOIN, because customers with no orders still matter. Another: “Find employees not assigned to projects.” Same pattern. If they default to INNER JOIN for everything, they're going to unintentionally drop important records in production and then blame the dashboard.

Ask what happens to unmatched rows. Ask where NULLs appear. Ask how duplicates can multiply when the relationship isn't one-to-one. That's where the rehearsed answers fall apart.

The best join answer isn't “INNER means intersection.” It's “I'd use LEFT JOIN here because the business wants missing relationships to remain visible.”

If you're interviewing for a data-heavy role, pair this with a more advanced prompt from CloudDevs' data engineer interview questions. The combination tells you who can join tables and who can think through the shape of a result set before they run the query.

3. What Is a PRIMARY KEY and What Are Its Characteristics?

A primary key is the column, or combination of columns, that uniquely identifies each row in a table. It shouldn't be null. It shouldn't be duplicated. It shouldn't be treated like optional plumbing someone adds “later” after the schema has already gone feral.

That last part is where real teams get burned.

What this question actually reveals

This isn't just a database fundamentals question. It tells you whether the candidate respects data identity.

If someone says, “We can always use the name field,” you've learned something valuable and unpleasant. If they say a primary key should be stable, unique, and chosen to support relationships cleanly, you're talking to someone who has seen a system survive contact with real users.

Use practical examples. employee_id in an HR system. order_id in commerce. A composite key like tenant_id plus resource_id in a multi-tenant application when uniqueness only exists inside the tenant boundary.

Follow-up prompts worth asking

Don't ask for a lecture on theory. Ask where people make bad decisions.

  • Natural vs surrogate keys: “Would you use email as a primary key?”
  • Distributed systems: “When would you prefer UUIDs over incrementing integers?”
  • Schema discipline: “What problems show up when a table has no primary key?”

A strong candidate will usually mention that meaningful data changes, which makes poor key material. They'll also mention that missing primary keys make updates, deduplication, and relationships much harder than they need to be.

If a table doesn't have a real primary key, it usually means the team gave up on modeling and hoped application code would save them. That rarely ends well.

Good hires don't just define a primary key. They defend why it matters.

4. What Is a FOREIGN KEY and How Does It Maintain Referential Integrity?

This one separates people who know SQL from people who merely visit it.

A foreign key links one table to another by referencing a primary key or unique key in the parent table. In plain English, it stops your database from inventing relationships that don't exist.

Why this question matters in practice

Say you have projects, developers, and assignments. Without foreign keys, an assignment can point to a developer who doesn't exist, a project that was deleted, or both. Congratulations, your data model is now fan fiction.

Foreign keys enforce referential integrity by requiring child rows to reference valid parent rows. That matters in billing systems, marketplaces, HR platforms, and any product where relationships are the business.

A strong candidate should also understand that foreign keys influence design decisions. Deleting a parent row can cascade, restrict, or fail depending on the rule you choose. None of those are “just syntax.” They reflect business intent.

Follow-up prompts that get past memorization

Ask questions with consequences:

  • Delete behavior: “When would you use ON DELETE CASCADE and when would you avoid it?”
  • Performance awareness: “Would you index foreign key columns?”
  • Operational caution: “How do soft deletes affect referential integrity?”

Good candidates usually say foreign key columns should be indexed when they're used in joins or lookups. They also know that cascade deletes can be convenient for strict parent-child data, but dangerous when records are shared, audited, or legally sensitive.

A foreign key is less about database purity and more about refusing to let bad application code make up fake relationships.

If they understand that, keep going. If they don't, don't hand them ownership of anything with billing, compliance, or customer data attached.

5. Explain the Difference Between UNION and UNION ALL

This is one of those questions that sounds small until someone gets it wrong in production.

UNION combines result sets and removes duplicates. UNION ALL combines result sets and keeps everything. That's the definition. Fine. The issue is whether the candidate understands when deduplication is helpful, when it's wasteful, and when it hides data problems.

Where people trip over this

Suppose you're combining historical orders with current-month orders during a migration. If the datasets are already mutually exclusive, UNION ALL is usually the honest choice. If you use UNION out of habit, you may force unnecessary deduplication work and mask overlap you needed to investigate.

On the other hand, if two systems can produce duplicate customer IDs and you need a unique list for outreach, UNION makes sense.

That's the true interview signal. Judgment.

Better follow-up prompts

Ask the candidate to choose between them in a scenario, then justify it.

  • Migration scenario: “We're combining old and new records during a cutover. Which one do you use?”
  • Reporting scenario: “I need a distinct list of active users across two sources. Now what?”
  • Debugging angle: “What if duplicates appear unexpectedly?”

A good candidate should mention that both queries must return compatible columns in the same order. They should also know that ORDER BY belongs at the end of the combined result, not sprinkled inside each branch like decorative parsley.

Short version: if they answer only with “one removes duplicates,” you've learned the minimum. If they explain data intent and performance tradeoffs, you've found someone who won't casually flatten your reporting logic.

6. What Is an INDEX and When Should You Use It?

Now we're getting somewhere.

A candidate who can discuss indexing clearly is usually far more useful than someone who can recite obscure syntax. Strong SQL performance depends on tuning habits like using EXPLAIN, indexing JOIN and WHERE columns, and filtering early for large datasets, as highlighted in this SQL sales analysis interview discussion.

That's the kind of answer you want. Practical. Not ceremonial.

A magnifying glass resting on an index divider tab organized alphabetically in a document filing system.

What a solid answer sounds like

An index is a data structure that helps the database find rows faster. It speeds up reads for certain access patterns. It also adds overhead to writes, because inserts, updates, and deletes may need to maintain the index.

That tradeoff matters. Indexing everything is not strategy. It's panic with admin privileges.

Ask for judgment, not slogans

Use scenarios like these:

  • Lookup-heavy workload: “Users frequently search orders by customer ID and date.”
  • Write-heavy table: “This event table gets constant inserts.”
  • Sorting pattern: “Dashboards always sort by created date within tenant.”

A good candidate should talk about indexing columns used in WHERE, JOIN, and ORDER BY. They may mention composite indexes when filters commonly use multiple columns together. They should also warn against blindly indexing low-value columns and should verify plans with EXPLAIN.

Hiring shortcut: Ask, “What's the downside of adding an index?” If they say “none,” end the suspense. They're not ready.

This question also opens the door to more modern, useful SQL evaluation. Today's interviews increasingly reward understanding optimizer behavior, execution plans, cardinality, and why a query is slow across different engines. That matters because PostgreSQL, MySQL, SQL Server, BigQuery, Snowflake, and Databricks don't behave identically. Anyone hiring serious SQL talent should care about that.

7. What Is Database Normalization and Why Is It Important?

Normalization gets mocked because people associate it with dusty theory and joyless whiteboards. That's unfair. Bad normalization creates real, expensive messes.

Normalization means organizing data to reduce redundancy and improve integrity. Store facts once, relate them properly, and stop duplicating business truth across random tables like you're hiding Easter eggs from your future self.

Three wooden cubes connected by lines with the labels Developers, Companies, and Projects inscribed on them.

What you're really testing

You're testing whether the candidate can design a schema that won't rot.

Say you store company names directly in every developer row. Now a company rebrands. Have fun updating that everywhere and praying nobody misspells it in one place. A normalized design moves company data into a companies table and references it cleanly.

That's not academic. That's maintenance sanity.

Where the good candidates separate themselves

Don't reward someone for rattling off normal forms like a trivia host. Ask them to explain a messy table they'd split apart and why.

  • Redundancy control: “What problems come from storing department names in every employee row?”
  • Tradeoff awareness: “When would you denormalize on purpose?”
  • Team thinking: “How do you keep schema design understandable for other engineers?”

A strong answer usually lands around normalizing to a practical level, often the point where redundancy is controlled and updates stay predictable. Then, only denormalize for a specific workload you can justify.

If you want a broader framework for evaluating answers, CloudDevs' database design best practices is a useful companion because it pushes the conversation toward maintainable schemas instead of interview theater.

Normalization isn't about making schemas pretty. It's about making bad data harder to create.

That's a standard worth keeping.

8. What Is a VIEW and When Would You Use It?

A view is a saved SELECT query that acts like a virtual table. Useful, yes. Magical, no.

This question matters because it exposes whether the candidate knows how to simplify access patterns without turning the database into a hall of mirrors.

The practical use cases

Views shine when teams repeatedly need the same logic.

Maybe finance needs a monthly billing summary. Maybe customer support needs a safe public profile view that excludes salary or contract fields. Maybe analysts keep rebuilding the same join across users, subscriptions, and usage events. A view can centralize that logic and reduce repeated mistakes.

That said, views are not a free lunch. A candidate should know that piling views on views on views can make debugging miserable. If a query is already hard to understand, hiding it behind another layer of abstraction doesn't fix the problem. It just makes the next person curse under their breath.

Good follow-up prompts

Ask these instead of “Can views be updated?” and other quiz-show bait.

  • Security angle: “Would you use a view to limit access to sensitive columns?”
  • Maintainability: “When does a view help more than a copied query in app code?”
  • Performance awareness: “What would make you consider a materialized view instead?”

A strong answer usually says views are good for consistency, access control, and reusable reporting logic. A better one adds that expensive aggregations may need materialization depending on the engine and workload.

A view is a contract. It tells the rest of the team, “Use this shape of data, not your own improvised version.”

That mindset is more valuable than the definition alone.

9. What Is the Difference Between DISTINCT and GROUP BY?

Hiring managers should ask this question more carefully than they usually do, because it exposes a common and expensive mistake. Plenty of candidates can recite definitions. Fewer can tell you which clause fits the business question in front of them.

DISTINCT removes duplicate rows from the result set. GROUP BY groups rows so you can summarize them with aggregates like COUNT, SUM, or AVG.

The syntax overlap fools weak candidates.

Ask for a concrete example and watch what happens. If you need a list of countries where your developers work, SELECT DISTINCT country is the clean answer. If you need the number of developers in each country, you use GROUP BY country with COUNT(*). Same column. Different job.

That distinction matters in production. A candidate who treats DISTINCT and GROUP BY as interchangeable usually writes muddy reporting queries, hides duplicate-join problems with DISTINCT, and ships totals nobody trusts.

What this question really tells you

This is not a vocabulary test. It tells you whether the candidate understands query intent.

Good candidates explain the difference in plain English: DISTINCT deduplicates output, while GROUP BY changes the grain of the result so you can aggregate. Strong candidates go one step further and point out that using DISTINCT to "fix" duplicate rows is often a red flag. It can mask a bad join or a broken data model instead of solving the actual issue.

That answer separates someone who debugs SQL from someone who decorates it.

Follow-up prompts that expose real skill

Use prompts that force judgment, not memorization.

  • Dedup judgment: “When is DISTINCT the right answer, and when is it hiding a join problem?”
  • Aggregation judgment: “Show me the count of developers by country, then explain why DISTINCT would be the wrong tool.”
  • SQL rules: “Why does GROUP BY fail if I select a column that isn't grouped or aggregated?”
  • Edge-case thinking: “Can GROUP BY return the same rows as DISTINCT? If yes, why would you still prefer one over the other?”

A solid candidate says non-grouped columns usually need an aggregate. A stronger one adds that while GROUP BY can sometimes mimic DISTINCT, DISTINCT is clearer when you only need uniqueness and no summary calculation.

DISTINCT answers, “Which values exist?” GROUP BY answers, “How do these rows roll up?”

Ask that way and you learn who understands SQL logic, not just SQL syntax.

10. What Is a SUBQUERY and How Do You Use It?

Subqueries are where candidates either show composure or start building a nest of parentheses they can't escape.

A subquery is a query nested inside another query. Used well, it breaks a problem into manageable parts. Used badly, it turns a readable statement into a hostage situation.

What you want to hear

A solid answer should explain common uses like filtering against a calculated value, checking existence, or staging logic before a final result.

Classic example: find employees paid above the average salary. The subquery calculates the average, and the outer query filters against it. Straightforward. Useful. No drama.

For more advanced roles, this gets more interesting. Dataquest notes that interviews for data analysts increasingly emphasize JOINs, GROUP BY, subqueries, NULL handling, and especially window functions, while live-coding screens reward problem restatement, relevant column selection, and query decomposition in its guide mentioned earlier. That's a key indicator. Can they decompose the problem calmly?

Better follow-up prompts

Don't ask “What is a correlated subquery?” and sit back like you've done something clever. Ask for tradeoffs.

  • Rewrite judgment: “When would you replace a subquery with a join?”
  • Scale awareness: “When would EXISTS be better than IN?”
  • Debugging instinct: “How would you validate a subquery before nesting it?”

A good candidate often says they test the inner query on its own first. Excellent. That's what people do who plan to finish the task this century.

They should also know that some subqueries are elegant, while others should be rewritten for readability or performance.

If a candidate can explain a subquery and then improve it, you're probably talking to someone who writes SQL for work, not just interviews.

10 Essential SQL Interview Questions Compared

Item Implementation complexity Resource requirements Expected outcomes Ideal use cases Key advantages
WHERE vs HAVING Low, placement in query matters (WHERE before aggregation, HAVING after) WHERE: minimal; HAVING: requires aggregation memory/CPU Correct stage-specific filtering; better performance when filtering pre-aggregation Filter rows before GROUP BY (WHERE); filter aggregated groups (HAVING) WHERE: early filtering for speed; HAVING: filters on aggregates
JOIN types (INNER, LEFT, RIGHT, FULL) Medium, join logic and aliasing can be complex with many tables Can be I/O and memory intensive on large tables; benefits from indexed join columns Combined datasets with different inclusion semantics (matches vs outer rows) Combining related tables for reports, enrichment, and data integration Flexible data combination; handles missing data via outer joins
PRIMARY KEY Low, simple to declare, design choices can be impactful Small storage for index; automatic index maintenance Uniquely identifies rows and speeds lookups Defining unique identifiers for entities (users, orders, instances) Ensures uniqueness, fast lookups, supports foreign keys
FOREIGN KEY Medium, requires schema planning and cascade rules Minor storage/index overhead; extra write-time checks Enforced referential integrity; prevents orphaned rows Enforcing relationships (projects ? developers, invoices ? companies) Automatic integrity enforcement; documents relationships
UNION vs UNION ALL Low, syntax straightforward; behavior differs on duplicates UNION: higher CPU/memory for dedup and sort; UNION ALL: minimal overhead Combined result sets; UNION removes duplicates, UNION ALL preserves them Merging datasets from multiple sources or timeframes UNION: unique merged results; UNION ALL: faster, preserves duplicates
INDEX Medium, requires selection of columns and maintenance strategy Increased storage; read speed improved, write operations slower Faster SELECT/JOIN/ORDER queries; slower INSERT/UPDATE/DELETE Speeding frequent lookups, joins, and ordered queries on large tables Dramatically improves read performance when used appropriately
Normalization High, requires careful schema design and trade-offs More tables and joins may increase read-time resources; less redundancy storage Reduced redundancy and anomalies; easier updates and consistency Foundational schema design for multi-tenant, maintainable databases Improves data integrity and reduces storage redundancy
VIEW Low, simple to create; complexity depends on underlying query Non-materialized: little storage, runtime cost; materialized: storage + refresh cost Encapsulated queries, consistent interfaces; may hide complexity Simplifying reports, security filters, reusable complex queries Abstraction, reusability, security via column/row restriction
DISTINCT vs GROUP BY Low, syntax simple; purpose differs (dedupe vs aggregate) DISTINCT: sorting/dedup cost; GROUP BY: aggregation memory/CPU DISTINCT returns unique rows; GROUP BY returns grouped aggregates Use DISTINCT for uniqueness; GROUP BY for aggregated metrics and analytics DISTINCT: simple deduplication; GROUP BY: supports aggregation functions
SUBQUERY (nested query) Medium to high, correlated subqueries add complexity Can be expensive; correlated subqueries may run repeatedly; JOINs often cheaper Enables stepwise filtering and derived comparisons; may be slower Multi-step filters, existence checks, derived-value comparisons Expressive and modular for complex logic; isolates steps within a query

Beyond the Whiteboard and the Real SQL Test

A good SQL interview isn't a pop quiz. It's an x-ray.

The questions above matter because they uncover how someone thinks about data shape, correctness, relationships, filtering, reuse, and performance. But on their own, they're still incomplete. A candidate can answer all of them reasonably well and still struggle when the data is messy, the business definition is vague, or the query runs slowly for reasons no textbook bothered to mention.

That's why the true test needs one practical exercise.

Give them a small but imperfect dataset. Include duplicates. Include missing values. Include a business prompt that sounds like something your team deals with. Ask them to define the metric, explain assumptions, write the query, and then talk through how they'd validate the result. Don't rescue them too quickly. Let them think out loud.

That's where the difference shows up.

The strongest candidates don't race to type. They restate the problem. They identify the tables and columns that matter. They notice where NULL handling changes the output. They ask clarifying questions about edge cases. If the role leans analytical, they may reach for window functions, because advanced SQL interviews often use real analytical workloads like month-over-month comparisons, top-N ranking, cumulative metrics, partitioning, date truncation, and rolling aggregations rather than syntax trivia, as discussed in the verified material above.

That's the person you want.

And if your environment uses a modern warehouse stack, stop pretending generic textbook SQL is enough. Many interview resources still center on joins, grouping, and window functions but skip the harder issue of engine behavior and portability. In real teams, candidates often need to understand optimizer behavior, execution plans, CTE materialization, index usage, cardinality estimates, and dialect quirks across PostgreSQL, MySQL, SQL Server, BigQuery, Snowflake, or Databricks. If you don't test for that where relevant, you're hiring for the wrong century.

The same goes for analytics hiring. A lot of teams no longer use SQL just to fetch rows from neat transactional tables. They use it to define metrics, interrogate event streams, handle duplicates, deal with late-arriving records, and validate business logic in cloud warehouses. Increasingly, they also need people who can review SQL generated by AI tools and spot what the model got wrong. That judgment matters. A candidate who can critique generated SQL is usually more valuable than one who can merely produce boilerplate from memory.

So here's the blunt version.

Ask fewer trivia questions. Ask better ones. Use the conceptual questions in this list to establish fundamentals, then move quickly into a hands-on exercise with realistic constraints. Watch how the person reasons when the answer isn't prepackaged. That's how you identify someone who will improve your data work instead of turning every dashboard bug into a three-team blame ritual.

If you want extra perspective on the people side of screening, Professional Careers Training's data analyst tips adds a useful complement around interview readiness and communication. Those soft edges matter once the SQL basics are covered.

And if your team doesn't want to spend its week building, refining, and administering technical screens, using a marketplace such as CloudDevs can be one practical option. CloudDevs states that it pre-vets Latin American developers and can provide a shortlist quickly, which may help companies that need to move faster without lowering the bar.


If you need developers who can handle real SQL work instead of just surviving interview theater, CloudDevs is worth a look. You can review pre-vetted LATAM talent, skip a chunk of the screening grind, and spend your time interviewing people who already clear the fundamentals.

Victor

Victor

Author

Senior Developer Spotify at Cloud Devs

As a Senior Developer at Spotify and part of the Cloud Devs talent network, I bring real-world experience from scaling global platforms to every project I take on. Writing on behalf of Cloud Devs, I share insights from the field—what actually works when building fast, reliable, and user-focused software at scale.

Related Articles

.. .. ..

Ready to make the switch to CloudDevs?

Hire today
7 day risk-free trial

Want to learn more?

Book a call