In the world of Information Science, organizing and processing data efficiently is essential. Whether you are searching for a book in a digital library, managing customer records in a bank, or developing software applications, the way data is stored and manipulated can drastically affect performance and usability. This is where data structures and algorithms come into play.
Data structures are specialized formats for organizing and storing data so that it can be accessed and modified efficiently. Algorithms are step-by-step procedures or formulas for solving problems, often involving data stored in these structures.
Understanding these concepts is crucial for competitive exams in Information Science because they form the foundation of problem-solving skills required in computer science and related fields. This chapter will guide you through the fundamental data structures and algorithms, building your knowledge progressively with clear explanations, examples, and visual aids.
Let's begin with two fundamental data structures: arrays and linked lists. Both store collections of elements, but they differ in how they organize memory and allow access.
An array is a collection of elements stored in contiguous memory locations. Think of it as a row of lockers, each holding one item, lined up side by side. Each element in the array can be accessed directly using its index, which starts from 0.
Example: Consider an array of 5 integers representing daily temperatures in °C: [30, 32, 31, 29, 33]. The temperature on day 3 is accessed as array[2] = 31.
Advantages:
Disadvantages:
A linked list is a collection of nodes where each node contains data and a reference (or pointer) to the next node in the sequence. Unlike arrays, linked lists do not require contiguous memory locations.
Imagine a treasure hunt where each clue leads you to the next location. Each node is a clue with information and a pointer to the next clue.
Example: A linked list storing the same temperatures: 30 -> 32 -> 31 -> 29 -> 33, where each arrow represents a pointer to the next node.
Advantages:
Disadvantages:
Next, we explore two important abstract data types: stack and queue. Both are collections with specific rules for adding and removing elements.
A stack follows the Last In, First Out (LIFO) principle. Imagine a stack of plates: you add (push) a plate on top and remove (pop) the top plate first.
Operations:
push(element): Add an element to the top.pop(): Remove and return the top element.peek(): View the top element without removing it.Example: Push 10, then 20, then 30. Pop returns 30 first, then 20, then 10.
A queue follows the First In, First Out (FIFO) principle. Think of a queue at a ticket counter: the first person to arrive is the first served.
Operations:
enqueue(element): Add an element at the rear.dequeue(): Remove and return the front element.front(): View the front element without removing it.Example: Enqueue 5, then 10, then 15. Dequeue returns 5 first, then 10, then 15.
graph TD subgraph Stack A[Push 10] --> B[Push 20] B --> C[Push 30] C --> D[Pop returns 30] D --> E[Pop returns 20] E --> F[Pop returns 10] end subgraph Queue G[Enqueue 5] --> H[Enqueue 10] H --> I[Enqueue 15] I --> J[Dequeue returns 5] J --> K[Dequeue returns 10] K --> L[Dequeue returns 15] end
Moving to more complex structures, trees and graphs allow us to represent hierarchical and networked data.
A tree is a collection of nodes connected in a hierarchy, with one node called the root. Each node may have zero or more child nodes, and nodes with no children are called leaves.
Binary Tree: Each node has at most two children, called left and right.
Binary Search Tree (BST): A binary tree where left child nodes contain smaller values and right child nodes contain larger values than the parent.
Traversal means visiting all nodes in a specific order:
A graph consists of nodes called vertices connected by edges. Graphs can be:
Graphs are used to model complex relationships like social networks, road maps, and communication networks.
Sorting is the process of arranging data in a particular order, usually ascending or descending. Efficient sorting is vital for faster searching and data organization.
Repeatedly swaps adjacent elements if they are in the wrong order. Simple but inefficient for large data.
Selects the minimum element from unsorted part and swaps it with the first unsorted element.
Builds the sorted array one element at a time by inserting elements at their correct position.
A divide and conquer algorithm that divides the list into halves, sorts them recursively, and merges the sorted halves.
Another divide and conquer method that selects a pivot, partitions the array around the pivot, and sorts partitions recursively.
graph TD A[Start with array] --> B[Divide array into two halves] B --> C[Recursively sort left half] B --> D[Recursively sort right half] C --> E[Merge sorted halves] D --> E E --> F[Sorted array]
Searching algorithms help find an element within a data structure.
Checks each element one by one until the target is found or the list ends. Works on unsorted data but inefficient for large data.
Works only on sorted arrays. It divides the search interval in half repeatedly, comparing the middle element to the target.
graph TD A[Start] --> B[Set low=0, high=n-1] B --> C{Is low <= high?} C -- No --> D[Element not found] C -- Yes --> E[Calculate mid = (low+high)/2] E --> F{Is array[mid] == target?} F -- Yes --> G[Return mid] F -- No --> H{Is array[mid] < target?} H -- Yes --> I[Set low = mid + 1] H -- No --> J[Set high = mid - 1] I --> C J --> CTo compare algorithms, we analyze their time complexity (how running time grows with input size) and space complexity (how much extra memory is needed).
Big O notation expresses the upper bound of an algorithm's growth rate, focusing on the dominant term and ignoring constants.
| Algorithm | Best Case | Average Case | Worst Case |
|---|---|---|---|
| Bubble Sort | O(n) | O(n2) | O(n2) |
| Selection Sort | O(n2) | O(n2) | O(n2) |
| Insertion Sort | O(n) | O(n2) | O(n2) |
| Merge Sort | O(n log n) | O(n log n) | O(n log n) |
| Quick Sort | O(n log n) | O(n log n) | O(n2) |
| Linear Search | O(1) | O(n) | O(n) |
| Binary Search | O(1) | O(log n) | O(log n) |
Step 1: Initialize an array of size 5 and a variable top = -1 indicating the stack is empty.
Step 2: push(10): Increment top to 0 and store 10 at array[0]. Stack: [10]
Step 3: push(20): Increment top to 1 and store 20 at array[1]. Stack: [10, 20]
Step 4: pop(): Remove element at array[1] (20), decrement top to 0. Stack: [10]
Step 5: push(30): Increment top to 1 and store 30 at array[1]. Stack: [10, 30]
Step 6: pop(): Remove element at array[1] (30), decrement top to 0. Stack: [10]
Step 7: pop(): Remove element at array[0] (10), decrement top to -1 (empty stack).
Answer: Stack states after operations: [10], [10, 20], [10], [10, 30], [10], empty.
Step 1: Set low = 0, high = 6 (array indices).
Step 2: Calculate mid = (0 + 6) / 2 = 3. Check array[3] = 20.
Step 3: Since 25 > 20, set low = mid + 1 = 4.
Step 4: Calculate mid = (4 + 6) / 2 = 5. Check array[5] = 30.
Step 5: Since 25 < 30, set high = mid - 1 = 4.
Step 6: Calculate mid = (4 + 4) / 2 = 4. Check array[4] = 25.
Step 7: Element found at index 4.
Answer: The element 25 is at position 4 (0-based index).
Step 1: Divide the list into two halves: [38, 27, 43] and [3, 9, 82, 10].
Step 2: Recursively divide [38, 27, 43] into [38] and [27, 43].
Step 3: Divide [27, 43] into [27] and [43].
Step 4: Merge [27] and [43] -> [27, 43].
Step 5: Merge [38] and [27, 43]: Compare 38 and 27 -> 27; then 38 and 43 -> 38; then 43 -> [27, 38, 43].
Step 6: Recursively divide [3, 9, 82, 10] into [3, 9] and [82, 10].
Step 7: Divide [3, 9] into [3] and [9], merge -> [3, 9].
Step 8: Divide [82, 10] into [82] and [10], merge -> [10, 82].
Step 9: Merge [3, 9] and [10, 82]: Compare 3 and 10 -> 3; 9 and 10 -> 9; then 10 and 82 -> 10; then 82 -> [3, 9, 10, 82].
Step 10: Merge [27, 38, 43] and [3, 9, 10, 82]: Compare 27 and 3 -> 3; 27 and 9 -> 9; 27 and 10 -> 10; 27 and 82 -> 27; 38 and 82 -> 38; 43 and 82 -> 43; then 82 -> [3, 9, 10, 27, 38, 43, 82].
Answer: Sorted list is [3, 9, 10, 27, 38, 43, 82].
Step 1: Initialize a queue and a visited set. Enqueue starting node A and mark visited.
Step 2: Dequeue A, visit it. Enqueue neighbors B and C, mark visited.
Step 3: Dequeue B, visit it. Enqueue neighbor D (not visited), mark visited.
Step 4: Dequeue C, visit it. Enqueue neighbors D and E; D already visited, enqueue E and mark visited.
Step 5: Dequeue D, visit it. Enqueue neighbor E; already visited, skip.
Step 6: Dequeue E, visit it. No unvisited neighbors.
Answer: BFS traversal order: A -> B -> C -> D -> E.
for (i = 1; i <= n; i++) { for (j = 1; j <= i; j++) { // constant time operation } } Step 1: Outer loop runs from 1 to n -> n iterations.
Step 2: Inner loop runs from 1 to i for each i.
Step 3: Total operations = \(\sum_{i=1}^{n} i = \frac{n(n+1)}{2}\)
Step 4: Simplify to \(O(n^2)\) ignoring constants and lower order terms.
Answer: Time complexity is \(O(n^2)\).
When to use: When choosing between data structures for memory and speed trade-offs.
When to use: For algorithms like merge sort and binary search.
When to use: To quickly recall traversal orders during exams.
When to use: When estimating algorithm efficiency.
When to use: When learning new algorithms or debugging.
Progress tracking is paywalled — subscribe to mark subtopics as understood and save your streak.
Go to practice →