Data Structures At Work: Hash Tabes, Dictionaries, and Objects
Optimise performance by using the right data structure for storing your data.
Seasons greetings everyone!
I hope you got some time away to spend with family and/or friends.
Today, I’ll be taking a couple of steps backward to address a concept that I skipped when I started this newsletter. When we’ve cleared up our missing past, I will proceed to talk about how key-value pair data structures are the best thing since slice bread and why you should use them whenever you can.
Data Structures and Algorithms
What are data structures?
Data structures refer to the ways in which we can store data. The way we store data then defines how we can access and manipulate that data. When you go to the mall, you stand in a queue. You wait for your turn, because it is “first come, first serve”. Consequently, the cashier can only operate on the first person in line at any point. This is a QUEUE data structure. Data structures are defined by their characteristics. Some characteristics of a queue are:
A queue is defined as a First In First Out (FIFO) data structure.
Retrieval is linear, and starts from the first element.
Easily confused with Queues are Stacks. But do not be confused, just think of them as words and opposite, you just need to remember one CORRECTLY.
When you put away plates, you stack them up. If you do not attempt a magic trick, you can only get the plates at the bottom (which came in first) out if you take out the plates at the top.
If you can take out the 3rd plate from the bottom without moving the plates on top, email me, let’s talk business[we can start a roadshow].
As seen here, stacks are defined by:
Last In First Out (LIFO) structure.
Linear retrieval, starting from the last inserted element.
So, What are Algorithms?
I like to think of algorithms as a series of steps you take (code) to solve a problem. When you write a function that returns the product of two numbers, you have implemented an algorithm. For most of you reading this, I believe you have implemented quite a few algorithms already.
Given we have multiple ways to store data, and that they define how we manipulate and access the data, we need to implement algorithms that will create the data structure and its methods.
This intro section was meant to distinguish between data structures: an idea of how to store and manipulate data AND algorithms: a series of steps you take to solve a problem. In essence, you use algorithms to build data structures.
Key-Value Pair Data Structures: Hash Tables, Dictionaries, and/or Objects
Depending on your preferred language, you either have a HashMap, Dictionary, Object, or Map as a key-value pair data structure. While we will not go into details about hash tables, hash functions, and collisions, we will discuss the advantages of using hash tables and examine and an example using Javascript.
Hash tables are a key-value pair data structure. One key can point to one value, thus why it’s called a table. This has some great advantages, namely one: constant read and write. [Note: We have discussed runtime in a previous newsletter] With a data structure that allows us to read, write, and update in constant time, we can implement faster algorithms to solve problems that would otherwise take more processing. Hash tables are implemented as Dictionaries, Maps, HashMaps, etc depending on your preferred language.
Hash Tables At Work
Let us take a look at two simple examples of how we can use an Object in JavaScript to solve some algorithm problems.
A popular beginner question, and my first algorithm question in an interview, is the two sum problem.
Given an array of numbers n, and a target t, return the two elements that sum up to t.
If you haven’t heard of this, this is a good time to give it a try.
We can do this with a brute-force approach that runs in O(n^2).
function find_sum(sum, array) {
for (let i=0; i<array.length; i++) {
for(let j=0; j<array.length; j++){
if (array[i] + array[j] === sum) return [array[i], array[j]];
}
}
return [-1,-1];
}
Here, we loop through the array, twice, and compare the sum when we add the elements at our current indices. This is almost always the most intuitive answer. Almost.
Questions whose intuitive solution is O(n^2) are often optimisable by using a key-value pair data structure to store the data.
My first interview hint: Think data structure before algorithm. How can I store this data so I can solve this problem efficiently?
Since your data structure will influence your final algorithm, choosing the wrong/right data structure will be the deciding factor in your interview or the efficiency of your solution.
Back to our two sum problem, how do we solve it using a key-value pair?
const find_sum = (target, array) => {
let results = {};
for (let i=0; i<array.length; i++) {
if (results.hasOwnProperty(array[i])) {
return [results[array[i]], array[i]]
}else{
results[target-array[i]] = array[i];
}
}
return [-1,-1];
}
First, we use an object as our key-value pair. The real idea is in the else statement; if we have not seen a number in the array, we add the difference between the target and that number to the results object. If we’ve seen the number at the current index, it means we have seen its counterpart before, thus we return both numbers.
This cuts down our time from O(n^2) to O(n). On the flip side, this also uses O(n) space.
More?
Sure. Why not?
Try this:
Implement a function getPalindrome(str) that takes a string str, and returns a valid palindrome from str, if it exists. Return false if there is no valid palindrome.
getPalindrome(“mmaa“) => maam
getPalindrome(“abb“) => bab
No. Really. Answer it.
I will explain the algorithm so you can try it in your preferred language.
Data Structure
We are using a key-value pair here, WHY?
A palindrome reads the same forward and backward, meaning I need to know how many of each character exists.
I need to be able to find the character with an odd number of appearances easily so I can use that as my middle or pivot for the palindrome.
I can easily use a key-value pair to store and update the count of each character.
Algorithm
A palindrome reads same forward as it does backwards. Easy.
Loop through the string and store the characters and their frequencies in the object or dictionary. {a: 1, b: 2}
Words in general can only be even or odd in length i.e str.length%2 is either 0 0r 1
Odd words will need to have the odd character in the middle for us to have a valid palindrome. E.g bab has 2-as and 1-b. In this case, our valid palindrome has to have “a” in the middle.
Even words do not need to specify a character in the middle.
Add the rest of the characters to both sides of the middle character.
Return the newly formed string.
All done!
Javascript Implementation
const getValidPalindrome = (str) => {
//create key-value pair to map char to frequency
const charHash = {};
let oddChar = ''
for(let i =0; i<str.length; i++) {
if(charHash[str[i]]) {
charHash[str[i]] += 1
}else {
charHash[str[i]] = 1
}
}
//find middle, if it exists
for(let char in charHash) {
if(charHash[char]%2 === 1) {
oddChar = char
charHash[char] --
break
}
}
let result = oddChar
//append chars to both ends
for(let char in charHash) {
while(charHash[char]) {
result = char + result + char
charHash[char] -= 2
}
}
//simple check to see that input and output are same length
if(result.length === str.length) return result
return false
}
getValidPalindrome("bba") => bab
getValidPalindrome(“maama”) => amama
Conclusion
This is the first of a couple of Data Structures at work newsletters that will look at hash tables. Getting comfortable with finding algorithms for hash tables was the turning point in my interviewing skills. Due to its performance improvements, solutions with a hash table data structure, when available, is always preferable.
Also keep in mind, if it is an O(n^2) intuitive solution, there is probably a more efficient solution using a hash table.
Do not forget to subscribe so that I remain alive to send you these awesome newsletters. See you next week when we explore two more questions using hash tables.