Coding interview cheat-sheet
Interview tips
- Ask questions about the problem to clarify it.
- Don’t jump into the code, think and discuss.
- Don’t stay silent for long; say when you are thinking: “Can I think for a second?”.
- Think out loud.
- Work through multiple examples.
- List the corner cases.
- Ask for approval before coding: “Does it seem like a good strategy?” “Should I start coding?”
Reflexes
- Can sorting the input data help me?
- Can splitting the input data help me?
- Can a dictionary help me?
- Can multiple pointers help me?
- Can a frequency array help me?
- Can multiple passes help me?
- Can case-based reasoning help me?
- Is there an “ends here” / “ends with” property?
Tricks
- Checking if $i \in Stack$ in $O(1)$: just keep an updated `is_in_stack` array where `is_in_stack[i]` tells whether $i \in Stack$.
- Sets with fewer than 64 possible items can be coded as integers.
- For singly linked lists, creating 2 pointers $n$ nodes apart may be handy. It is also applicable to every sequence of elements. (find the middle / $n^{th}$ from the end, a cycle…)
- Use infinite boundaries: `float('inf')` or `math.inf`.
- Use frequency arrays when counting letters. A counting sort might also prove useful later on (as it is linear for frequencies).
- To find $a$ and $b$ in an array so that $a+b = c$, save every seen number in a set $S$ and, for every new $i$, check if $c-i \in S$ (see the sketch below).
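A minimal sketch of that last trick (the helper name `two_sum` is mine, not from the original notes):

```python
def two_sum(nums, c):
    """Return a pair (a, b) from nums with a + b == c, or None."""
    seen = set()
    for i in nums:
        if c - i in seen:   # a previously seen number completes the pair
            return c - i, i
        seen.add(i)
    return None
```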
Properties to look for
- sorted ➜ Bisection search
- iterative DFS ➜ stack
- BFS ➜ queue
  - Need to keep track of the depth? ➜ 2 queues
- Optimal substructure ➜ Dynamic programming: compute all the optimal solutions of size $0 \ldots n$ and deduce the solution.
  - When it is possible to craft the optimal solution for an instance of size $n$ using the solutions to instances of size $i_1, \ldots, i_p < n$.
  - Can be on a pair of parameters: the solution of size $(n, k)$ can be deduced from solutions of size $(i, j)$ where $i \leq n$ and $j \leq k$.
  - Examples: subarray, knapsack, longest substring
Strategies
Do It Yourself
Solve an example with your brain and hands and see what “algorithm” you used.
B.U.D.
Bottlenecks, Unnecessary work, Duplicated work. Walk through your best algorithm and look for these flaws.
Space/Time tradeoffs
Can I save time by using more space?
- Hash tables
- Frequency arrays
- Dynamic programming
Data Structures
- Stack
  - $O(1)$: `l = []`, `l.append()`, `l.pop()`, `l[i]`, `len(l)`
  - $O(n)$: `del l[i]`, `l[inf:sup:step]`
  - $O(n \log_2(n))$: `l.sort()`, `sorted(l)`
- Queue
  - $O(1)$: `dq = deque()`, `dq.popleft()`, `dq.appendleft(x)`, plus the normal list operations except slices
- Dictionary / Hashtable
  - $O(1)$ on average (worst case $O(n)$): `dic = {}`, `key in dic`, `dic[key]`, `del dic[key]`
- Set
  - $O(1)$: `s = set()`, `x in s`, `s.add(x)`, `s.discard(x)`
  - $O(n_1)$: `s1 | s2`, `s1 & s2`, `s1 - s2`, `s1 ^ s2`
- Heap / Priority Queue
  - $O(1)$: `h = []`
  - $O(\log_2(n))$: `heappush(h, x)`, `heappop(h)` (from the `heapq` module)
  - $O(n)$: `heapify([1, 5, 2, 3, ...])` (in-place)
  - Array layout: `left = 2*i; right = left + 1; father = i // 2` if the heap is stored in a 1-based array (`h[0]` unused); `left = 2*i + 1; right = 2*i + 2; father = (i - 1) // 2` in a 0-based array
- Union find
  - to be continued
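Since this entry is still a stub, here is a minimal sketch of one common implementation (path compression and union by rank); the class and method names are mine, not from the original notes:

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # walk up to the root, compressing the path on the way
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False              # already in the same component
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx          # attach the shallower tree under the deeper one
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True
```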
Recursion
def f():
    # Base case
    ...
    # Recursive calls
    ...
Bisection search
Bisection search ends when `left` and `right` cross.
def bisect(xs, key, f):
    left, right = 0, len(xs)
    while left < right:
        mid = left + (right - left) // 2  # avoid integer overflow (unnecessary in python)
        if f(xs[mid], key): left = mid + 1  # Move right
        else: right = mid                   # Move left
    return left
When will they cross?
- `f(x, key) = x < key` ➜ move left when `x == key` ➜ index of the first occurrence of key
- `f(x, key) = x <= key` ➜ move right when `x == key` ➜ index of the last occurrence of key + 1
bisect_left = lambda xs,key: bisect(xs, key, lambda x,y: x<y) # First occ
bisect_right = lambda xs,key: bisect(xs, key, lambda x,y: x<=y) - 1 # Last
- It might be necessary to check that the final index actually points to key (see the usage example below).
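For instance, using the `bisect` helper defined above (illustrative values):

```python
xs = [1, 2, 2, 2, 5]
bisect_left(xs, 2)            # 1: index of the first 2
bisect_right(xs, 2)           # 3: index of the last 2
i = bisect_left(xs, 3)
i < len(xs) and xs[i] == 3    # False: 3 is not in xs
```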
Sorting algorithms
Merge sort
- Split the list in 2
- Recursively sort the left-most part and right-most part
- Merge these parts
def merge(t, i, j, k, aux):
    # merge t[i:k] in aux[i:k] assuming t[i:j] and t[j:k] are sorted
    a, b = i, j  # iterators
    for s in range(i, k):
        if a == j or (b < k and t[b] < t[a]):
            aux[s] = t[b]
            b += 1
        else:
            aux[s] = t[a]
            a += 1

def mergeSort(t):
    aux = [None] * len(t)
    def mergeRec(i, k):
        # merge sort t from i to k
        if k > i + 1:
            j = i + (k - i) // 2
            mergeRec(i, j)
            mergeRec(j, k)
            merge(t, i, j, k, aux)
            t[i:k] = aux[i:k]
    mergeRec(0, len(t))
Counting sort
- Count the occurrences of each possible value
- Write each value into the array as many times as it occurs, starting from the smallest one
def countingSort(array, maxval):
    """in-place counting sort
    O(n + maxval)"""
    m = maxval + 1
    # count the occurrence of every possible value
    count = [0] * m  # /!\
    for a in array:
        count[a] += 1
    i = 0
    for a in range(m):
        for _ in range(count[a]):
            array[i] = a
            i += 1
    return array
Dynamic programming
Simple steps
- Recursive solution
- Store intermediate results
- Bottom up (Optional)
- Consider adding dummy values because things like `res[x][y]` may throw index errors on corner cases.
- It can be easier to find the size of the best solution and then backtrack to find the solution itself.
Recursive solution
#-- 1 Naive recursive solution
def f():
    # Base case
    ...
    # Recursive calls
    ...
Memoization
`@lru_cache(maxsize=None)` (from `functools`) can automate the memoization process by using a dictionary to store the result of function calls (less efficient than an array).
#-- 2 Adding the memoization
# a slot for every possible pair of parameters
res = [[-1] * len(Y) for _ in range(len(X))]
# @lru_cache(maxsize=None) built-in auto-memoization
def f(x, y):
    # Base case
    ...
    # Already seen case
    if res[x][y] != -1:
        return res[x][y]
    # Recursive calls
    ...  # Just update the res[x][y] slot
    return res[x][y]
Bottom up
#-- 3 Bottom up
def f(X, Y):
    res = [[-1] * len(Y) for _ in range(len(X))]  # pre-allocate memory, unnecessary in python
    for x in range(1, len(X)):  # skipping dummy value
        for y in range(1, len(Y)):
            ...  # update the res[x][y] slot
    return res[-1][-1]
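As a concrete illustration of the bottom-up pattern (this example is mine, not part of the original notes): the edit distance between two strings, with a dummy row and column as suggested above.

```python
def edit_distance(a, b):
    # res[x][y] = edit distance between a[:x] and b[:y]
    # row 0 and column 0 are the dummy values (empty prefixes)
    res = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for x in range(len(a) + 1):
        res[x][0] = x                       # delete everything
    for y in range(len(b) + 1):
        res[0][y] = y                       # insert everything
    for x in range(1, len(a) + 1):
        for y in range(1, len(b) + 1):
            if a[x - 1] == b[y - 1]:
                res[x][y] = res[x - 1][y - 1]
            else:
                res[x][y] = 1 + min(res[x - 1][y],      # deletion
                                    res[x][y - 1],      # insertion
                                    res[x - 1][y - 1])  # substitution
    return res[-1][-1]
```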
int to char
chr(65) # -> 'A'
ord('A') # -> 65
chr(97) # -> 'a'
ord('a') # -> 97
Base conversion
def basebTo10(x, b=2):  # Horner scheme
    # x is a sequence of digits, most significant first
    u = 0
    for a in x:
        u = b * u + a
    return u

def base10Tob(q, b=2):
    s = ''
    while q > 0:
        q, r = divmod(q, b)
        s = str(r) + s
    return s

# when our base is a power of 2
def base10ToPowOf2(q, pow=1):
    s = ''
    mask = (1 << pow) - 1  # keep the low `pow` bits of q
    while q > 0 or not s:
        q, s = q >> pow, str(q & mask) + s
    return s

#-- When there is no 0 in our base
# for instance counting excel columns
def base10TobNoZero(q, b=2):
    s = ''
    while q > 0 or not s:
        q, r = divmod(q - 1, b)  # -1 to remove 0
        s = chr(65 + r) + s
    return s
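A quick sanity check of the helpers above (assuming `basebTo10` receives a list of digits, most significant first):

```python
basebTo10([1, 0, 1, 1], b=2)   # 11
base10Tob(11, b=2)             # '1011'
base10TobNoZero(28, b=26)      # 'AB' (Excel-style column name)
```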
Bitwise operators
- `x << y`: returns x with the bits shifted to the left by y places (new bits on the right-hand side are zeros). This is the same as multiplying x by `2**y`.
- `x >> y`: returns x with the bits shifted to the right by y places. This is the same as floor-dividing (`//`) x by `2**y`.
- `x & y`: bitwise AND. Each bit of the output is 1 if the corresponding bit of x AND of y is 1, otherwise it’s 0. Commutative and associative.
- `x | y`: bitwise OR. Each bit of the output is 0 if the corresponding bit of x AND of y is 0, otherwise it’s 1. Commutative and associative.
- `~x`: the complement of x, i.e. the number you get by switching each 1 for a 0 and each 0 for a 1. This is the same as `-x - 1`.
- `x ^ y`: bitwise exclusive OR. Each bit of the output is the same as the corresponding bit in x if that bit in y is 0, and the complement of the bit in x if that bit in y is 1. Commutative and associative.
Coding sets on bits
Efficient when there are fewer than 64 possible values.

| Set | Binary |
|---|---|
| $\emptyset$ | `0` |
| $\{i\}$ | `1 << i` |
| $\{0, 1, \ldots, n-1\}$ | `(1 << n) - 1` |
| $A \cup B$ | `A \| B` |
| $A \cap B$ | `A & B` |
| $(A \setminus B) \cup (B \setminus A)$ | `A ^ B` |
| $A \subseteq B$ | `A & B == A` |
| $i \in A$ | `(1 << i) & A` |
| $\{\min(A)\}$ | `-A & A` |
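A small illustration of the table above (the values are chosen arbitrarily):

```python
A = (1 << 0) | (1 << 3) | (1 << 5)   # {0, 3, 5}
B = (1 << 3) | (1 << 4)              # {3, 4}
union        = A | B                 # {0, 3, 4, 5}
intersection = A & B                 # {3}
sym_diff     = A ^ B                 # {0, 4, 5}
is_subset    = A & B == B            # is B a subset of A? False here
contains_3   = bool((1 << 3) & A)    # True
min_of_A     = -A & A                # 1 == 1 << 0, i.e. {min(A)}
```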
Algorithms
You should ideally know about:
(Most are overkill for an interview but nice to know as a software engineer)
Numeric/mathematical
- Kadane’s algorithm (maximum subarray) (see the sketch after this list)
  - every subarray has an ending
  - the maximum subarray ending at spot i + 1 is either
    - the maximum subarray ending at spot i, plus `A[i + 1]`
    - `A[i + 1]` alone
- Boyer-Moore majority vote algorithm (see the sketch after this list)
  - find a candidate for the majority element
    - keep a counter of the number of occurrences of the current candidate while iterating through the list: ++ if the current element is the candidate, -- otherwise
    - if the counter reaches 0, the current element becomes the new candidate
  - check that the candidate really is the majority element
- Quickselect / Median-of-medians
- Reservoir sampling
- Sieve of Eratosthenes
- Alias method
- Euclid’s algorithm (see the sketch after this list)
  - if `b != 0`, return `gcd(b, a % b)`
  - else, return `a`
- Exponentiation by squaring (see the sketch after this list)
  - `res = res * a` if n is odd, then `a = a * a` and halve n
- Matrix exponentiation
  - exponentiation by squaring applied to a matrix
- Range minimum query
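A sketch of Kadane’s algorithm as described above (maximum subarray sum; the function name is mine):

```python
def kadane(A):
    best = ending_here = A[0]
    for x in A[1:]:
        # best subarray ending here: extend the previous one or restart at x
        ending_here = max(ending_here + x, x)
        best = max(best, ending_here)
    return best
```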
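A sketch of the Boyer-Moore majority vote (the returned candidate still has to be verified with a second pass):

```python
def majority_candidate(A):
    candidate, count = None, 0
    for x in A:
        if count == 0:
            candidate = x              # adopt a new candidate
        count += 1 if x == candidate else -1
    return candidate                   # verify: A.count(candidate) > len(A) // 2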
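Iterative sketches of Euclid’s algorithm and exponentiation by squaring (the latter becomes matrix exponentiation when `a` is a matrix and `res` the identity):

```python
def gcd(a, b):
    while b:                 # when b == 0, a is the gcd
        a, b = b, a % b
    return a

def power(a, n):
    res = 1
    while n:
        if n & 1:            # n odd: multiply the result by a
            res = res * a
        a = a * a            # square a and halve n
        n >>= 1
    return res
```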
Tree/graph
- BFS
  - queue
- DFS
  - stack or recursion
- Dijkstra’s algorithm
- A* search
- Toposort
- Morris traversal
- Prim’s algorithm
- Kruskal’s algorithm
- Bellman-Ford algorithm
- Graham scan algorithm
- Ford-Fulkerson algorithm / Edmonds-Karp algorithm
- Floyd’s cycle finding algorithm
- Closest pair of points algorithm
- Hopcroft-Karp algorithm
String
- Longest common subsequence (see the sketch after this list)
  - `lcs[p][q]` = LCS of str1 up to index `p` and str2 up to index `q` (DP)
  - `lcs[p][q] = str1[p] == str2[q] ? 1 + lcs[p-1][q-1] : max(lcs[p][q-1], lcs[p-1][q])`
- Rabin-Karp algorithm
- Knuth-Morris-Pratt algorithm
- Aho-Corasick algorithm
- Ukkonen’s algorithm / SA-IS algorithm
- Manacher’s algorithm
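A sketch of the longest common subsequence recurrence above (length only; a dummy row and column avoid index errors):

```python
def lcs_length(str1, str2):
    # lcs[p][q] = LCS length of str1[:p] and str2[:q]
    lcs = [[0] * (len(str2) + 1) for _ in range(len(str1) + 1)]
    for p in range(1, len(str1) + 1):
        for q in range(1, len(str2) + 1):
            if str1[p - 1] == str2[q - 1]:
                lcs[p][q] = 1 + lcs[p - 1][q - 1]
            else:
                lcs[p][q] = max(lcs[p][q - 1], lcs[p - 1][q])
    return lcs[-1][-1]
```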
Search
- Bisection search
- Interpolation search
- Meta binary search
Sorting
- Merge sort
- Heap sort
- Quicksort + Three way partitioning + Median of three
- Dutch national flag algorithm
Other
- Fisher-Yates shuffle
- Shunting-yard algorithm
- Minimax
- Mo’s algorithm
- Square root decomposition
- Rolling hash
Reverse interview
- Questions about the stack, the tests, the integration process, documentation of production incidents…
- Standard dev environment? Mandatory?
- What are the weekly meetings?
- What’s the remote policy?
- What’s the product schedule and deployment frequency?
- What about conference/event attendance?
- Is there any dedicated time for self-training? For contributing to FOSS?
- What does it take to be successful here?
To evaluate the overall software engineering quality, use the Joel test (by Joel Spolsky, Stack Overflow co-founder):
- Do you use source control?
- Can you make a build in one step?
- Do you make daily builds?
- Do you have a bug database?
- Do you fix bugs before writing new code?
- Do you have an up-to-date schedule?
- Do you have a spec?
- Do programmers have quiet working conditions?
- Do you use the best tools money can buy?
- Do you have testers?
- Do new candidates write code during their interview?
- Do you do hallway usability testing?