Three months of practicing binary search trees on paper. Search procedure memorised. Insertion procedure memorised. Three deletion cases memorised. Then came the onsite. The interviewer drew a tree on the whiteboard, asked for an insertion walkthrough, and the recall worked fine. Then she asked for a deletion of a node with two children, and the freeze lasted almost a full minute before the inorder successor steps came out without any real understanding of why they preserved the ordering. She let the explanation finish, then asked one followup: why does the inorder successor specifically work? There was no answer beyond "because the textbook says so." That was the moment the gap became obvious. The months spent memorising procedures had skipped the part the interview was actually testing.
What changed afterwards:
- BST search, insert, and delete are not three procedures. They are the same recursive walk with three different terminal conditions.
- The two child deletion case is the one interviewers probe because it is the only case where the invariant has to be explained out loud.
- Memorising the inorder successor trick doesn't transfer. Being able to say why the inorder successor specifically works does.
- Once the invariant becomes the unit of memory, every operation that preserves it becomes derivable on the spot.
Where the approach was breaking
What had been getting practiced wasn't the data structure. It was a list of procedures the invariant could have handed over for free.
The recall pattern looked like this. Open a BST problem. Recognise it as a BST problem. Try to recall the right procedure for whatever the question asked. Search? Walk down comparing target to current. Insert? Same walk, create a leaf at the bottom. Delete? Three cases, pick the matching one, run the steps. The recall worked fine for problems that matched the textbook exactly. It fell apart the moment the problem deviated, because there was no deeper rule available to derive a new procedure from.
The specific failure on that first interview was treating the two child deletion case as a separate procedure rather than a question about the invariant. The procedure said "find the inorder successor, copy its value, delete the original successor." Those steps executed fine. What didn't execute was the reason behind them, which was the actual signal the interviewer was checking.
The gap is small in concept and load bearing in practice. If you can recite the steps but can't explain why each one preserves the rule, you will lose the thread whenever the interview adds a constraint that doesn't appear in the textbook version. And interviews almost always add a constraint.
What started working differently
After that interview, the practice rebuild was small in shape and large in effect. Instead of "memorise the procedure, then run it," the new approach was "start from the invariant, then derive what has to happen to preserve it." Same problems, same operations, completely different relationship to the material.
The contrast looked like this:
| Before (procedure first) | After (invariant first) |
| Memorise search as a walk down comparing values | Start from left < node < right and derive the walk as the consequence |
| Memorise insert as "find null pointer, create leaf" | Notice that insert is the same walk, only the terminal action changes |
| Memorise three deletion cases | Ask "what value can sit in this slot without breaking the invariant?" |
| Recall the inorder successor as a step | Derive the inorder successor as the answer to the slot question |
| Stall on novel problems | Answer novel problems by reasoning forward from the rule |
The shift was to remember fewer things, more carefully. There is exactly one rule for a BST: every node's left subtree contains only smaller values, every right subtree contains only larger values. Every operation either preserves the rule or is incorrect. That framing made the procedures fall out naturally, because the question "what do I do here?" reduced to "what action keeps the rule intact?"
What that looked like on the next problem
A month later came a phone screen at a different company. The interviewer gave a tree with [8, 3, 10, 1, 6, 14] and asked for a search for 6 and then an insertion of 7.
Search started at the root. 6 < 8, so go left to 3. Then 6 > 3, so go right to 6. Found it.
def search(node, target):
if node is None or node.value == target:
return node
if target < node.value:
return search(node.left, target)
return search(node.right, target)
Then inserting 7 was the same walk. From the root, 7 < 8, go left. Then 7 > 3, go right to 6. Then 7 > 6, go right again. Right pointer is null. Create node 7 there.
def insert(node, value):
if node is None:
return Node(value)
if value < node.value:
node.left = insert(node.left, value)
elif value > node.value:
node.right = insert(node.right, value)
return node
What was different this time was the framing said out loud after the code finished. Search and insert are the same recursive walk. Search stops when the value matches or the pointer is null. Insert stops at a null pointer and writes a new leaf there. The walk is what the invariant requires. The terminal action is the only thing that changes between operations. The interviewer asked a few more questions and then moved on. The framing was the part that mattered, not the code.
The same shift made deletion easier too. Instead of three cases, the new mental model was one question: what value can sit in this slot without breaking left < node < right? For a leaf, the answer is no value. Drop the pointer. For a one child node, the answer is the child's value, because the child's subtree already satisfies the invariant against the parent. For a two child node, the answer is the smallest value larger than the deleted node (the inorder successor), which is what the slot requires after deletion. Three procedures collapsed into three answers to the same question.
Why this works (the invariant doing the heavy lifting)
The reason the invariant first framing holds up under interview pressure has a name in learning science: near transfer versus far transfer. Memorising a procedure builds a near transfer skill. The procedure applies to problems that look exactly like the one that got practiced. It doesn't apply to problems that look slightly different, because the procedure is tied to the specific form of the original.
Deriving a procedure from a rule builds a far transfer skill. The rule applies to any problem that respects the rule, regardless of how the form changes. The rule is what generalises. The procedure is what gets generated for the specific case.
BSTs make this concrete because there is exactly one rule and there are dozens of variant problems that ask you to apply it. BST validation. Conversion to sorted array. Conversion to sorted doubly linked list. Range queries. Kth smallest element. Lowest common ancestor in a BST. BST iterator. All derivable from "left subtree smaller, right subtree larger." None easy to memorise as individual procedures, because there are too many of them and they shade into each other.
The inorder successor lesson walks the two child deletion case from the invariant down, which is the framing that most first practice resources skip. The procedure becomes the easy part once the question is framed as "what value belongs in this slot?"
What I'd tell someone in the same spot
If your BST practice is at the point where the procedures feel memorised but the mock interviews aren't going well, the shift to make is to stop quizzing yourself on the steps and start quizzing yourself on the rule.
A practice protocol that worked for the rebuild:
- Before solving any BST problem, write the invariant at the top of the page.
left < node < right for every node, applied recursively to every subtree. Look at it for ten seconds.
- Read the problem and ask: what action does the invariant require here? Not "what procedure do I run." What does the rule say has to happen.
- Trace one example by hand on paper, narrating each step as "this preserves the rule because..." If that sentence won't finish, what's running is a memorised procedure, not a derivation from the rule.
- Code it up. If the code falls out naturally from the narration, you have it. If a procedure shows up that you can't justify, stop and go back to step 2.
The protocol slows things down at first. Adding a thirty second pause to write the invariant and narrate the reasoning feels expensive under time pressure. The payoff is that the slow practice transfers. When the interview adds a constraint, the procedure that would have been recalled is wrong, but the rule still applies and the new procedure derives in real time.
That gap, between solving a BST problem and recognising why an operation works, is the thing that pushed me to build Codeintuition. Watching engineers grind through hundreds of tree problems without the invariant first framing was the pattern that wouldn't stop showing up. The tree courses now lead with the rule and let the procedures fall out, which is the ordering that would have saved months on the first attempt.
Originally posted on my own blog, a slightly longer version with the inorder traversal proof and the balanced construction algorithm.
What is a data structure that started making sense for you only after you stopped memorising the operations and started re-deriving them from the rule?