Data Structures Interview Questions

Question 1

What is the difference between an array and a linked list?

Answer

Arrays store elements in contiguous memory with O(1) random access by index but O(n) insertions and deletions in the middle due to shifting. Linked lists store elements in nodes with pointers, allowing O(1) insertions and deletions at known positions but requiring O(n) traversal for access. Arrays have better cache locality, while linked lists use memory proportional to their size without needing reallocation.

Question 2

How does a hash table work and how are collisions handled?

Answer

A hash table maps keys to array indices using a hash function, providing average-case O(1) lookups, insertions, and deletions. Collisions (when two keys hash to the same index) are resolved through chaining (storing a linked list at each index) or open addressing (probing for the next available slot). Good hash functions and appropriate load factors minimize collisions and maintain performance.

Question 3

What is a binary search tree and what are its time complexities?

Answer

A binary search tree (BST) is a tree where each node's left subtree contains values less than the node and the right subtree contains values greater. Search, insertion, and deletion are O(log n) on average for a balanced tree, but degrade to O(n) in the worst case when the tree becomes skewed (essentially a linked list). Self-balancing variants like AVL trees and red-black trees guarantee O(log n) operations.

Question 4

What is the difference between a stack and a queue?

Answer

A stack follows Last-In-First-Out (LIFO) ordering, where the last element added is the first removed, like a stack of plates. A queue follows First-In-First-Out (FIFO) ordering, where the first element added is the first removed, like a line of people. Stacks are used for function calls, undo operations, and expression evaluation. Queues are used for task scheduling, BFS, and buffering.

Question 5

Explain a heap data structure and its applications.

Answer

A heap is a complete binary tree that satisfies the heap property: in a max-heap, every parent is greater than or equal to its children; in a min-heap, every parent is less than or equal. Heaps support O(log n) insertion and extraction of the min/max element, with O(1) access to the top element. They are commonly used to implement priority queues, heap sort, and algorithms like Dijkstra's shortest path.

Question 6

What is a trie and when would you use one?

Answer

A trie (prefix tree) is a tree-like structure where each node represents a character and paths from root to leaf form complete strings. Lookups, insertions, and deletions are O(m) where m is the string length, independent of the number of stored strings. Tries are ideal for autocomplete, spell checking, IP routing tables, and any application requiring efficient prefix matching.

Question 7

How does a graph differ from a tree?

Answer

A tree is a connected, acyclic graph with exactly one path between any two nodes and a designated root node. A general graph can have cycles, multiple paths between nodes, disconnected components, and no inherent root or hierarchy. Trees have exactly n-1 edges for n nodes, while graphs can have any number of edges. Graphs are represented using adjacency lists or adjacency matrices.

Question 8

What is the difference between BFS and DFS?

Answer

Breadth-First Search (BFS) explores all neighbors at the current depth before moving deeper, using a queue, and guarantees the shortest path in unweighted graphs. Depth-First Search (DFS) explores as deep as possible along each branch before backtracking, using a stack or recursion. BFS uses more memory (O(width of tree)) while DFS uses O(depth). BFS is preferred for shortest paths; DFS for topological sorting, cycle detection, and maze solving.

Question 9

What is a balanced binary tree and why does it matter?

Answer

A balanced binary tree ensures that the height difference between left and right subtrees of any node is bounded (typically by 1), keeping the overall height at O(log n). Without balancing, a BST can degrade to a linked list with O(n) operations. AVL trees maintain strict balance with rotations after each operation, while red-black trees use a slightly relaxed balance for fewer rotations during insertions and deletions.

Question 10

What is an adjacency list vs an adjacency matrix?

Answer

An adjacency list stores each vertex's neighbors as a list, using O(V + E) space and efficient for sparse graphs. An adjacency matrix uses a V x V boolean matrix, consuming O(V^2) space but providing O(1) edge lookups. Adjacency lists are better for most real-world graphs which tend to be sparse, while matrices are preferred for dense graphs or when frequent edge existence checks are needed.

Question 11

What is a doubly linked list and what advantage does it have over a singly linked list?

Answer

A doubly linked list has nodes with pointers to both the next and previous nodes, while a singly linked list only has forward pointers. This allows O(1) deletion of a node when given a reference to it (without needing to traverse from the head to find the predecessor) and enables efficient bidirectional traversal. The trade-off is extra memory per node for the additional pointer and slightly more complex insertion/deletion logic.

Question 12

What is a bloom filter?

Answer

A bloom filter is a space-efficient probabilistic data structure that tests whether an element is a member of a set. It can return false positives (saying an element is present when it is not) but never false negatives (if it says absent, the element is definitely not in the set). It uses multiple hash functions mapping to a bit array, and is used in databases, caches, and network routers to avoid expensive lookups for non-existent keys.

Question 13

How would you implement a LRU cache?

Answer

An LRU (Least Recently Used) cache is best implemented using a hash map combined with a doubly linked list. The hash map provides O(1) key lookups, while the doubly linked list maintains access order with the most recently used item at the head. On access, move the node to the head; on insertion when full, evict the tail node. Both operations are O(1).

Question 14

What is a circular buffer (ring buffer)?

Answer

A circular buffer is a fixed-size array that wraps around, using two pointers (head and tail) to track the start and end of data. When the end of the array is reached, new data wraps to the beginning, overwriting the oldest entries. It provides O(1) enqueue and dequeue operations and is widely used in streaming data, audio buffers, and producer-consumer scenarios where bounded memory is required.

Question 15

What is the difference between a B-tree and a binary search tree?

Answer

A B-tree is a self-balancing tree where each node can hold multiple keys and have multiple children, keeping the tree short and wide. This minimizes disk I/O by reducing the number of nodes visited during lookups, making B-trees ideal for databases and file systems. Binary search trees have at most two children per node and one key, resulting in taller trees that require more traversal steps. B-trees guarantee O(log n) operations and remain balanced by design.