👁 Preview — Study, Practice and Revise are open; mock tests and the rest of the syllabus unlock on subscription. Unlock all · ₹4,999
← Back to Information Organization
Study mode

Data structures and algorithms

Introduction

In the world of Information Science, organizing and processing data efficiently is essential. Whether you are searching for a book in a digital library, managing customer records in a bank, or developing software applications, the way data is stored and manipulated can drastically affect performance and usability. This is where data structures and algorithms come into play.

Data structures are specialized formats for organizing and storing data so that it can be accessed and modified efficiently. Algorithms are step-by-step procedures or formulas for solving problems, often involving data stored in these structures.

Understanding these concepts is crucial for competitive exams in Information Science because they form the foundation of problem-solving skills required in computer science and related fields. This chapter will guide you through the fundamental data structures and algorithms, building your knowledge progressively with clear explanations, examples, and visual aids.

Arrays and Linked Lists

Let's begin with two fundamental data structures: arrays and linked lists. Both store collections of elements, but they differ in how they organize memory and allow access.

Arrays

An array is a collection of elements stored in contiguous memory locations. Think of it as a row of lockers, each holding one item, lined up side by side. Each element in the array can be accessed directly using its index, which starts from 0.

Example: Consider an array of 5 integers representing daily temperatures in °C: [30, 32, 31, 29, 33]. The temperature on day 3 is accessed as array[2] = 31.

Advantages:

  • Fast access to elements using indices (O(1) time complexity).
  • Simple and efficient for fixed-size collections.

Disadvantages:

  • Fixed size: once created, the size cannot be changed easily.
  • Insertion and deletion (except at the end) can be costly as elements may need to be shifted.

Linked Lists

A linked list is a collection of nodes where each node contains data and a reference (or pointer) to the next node in the sequence. Unlike arrays, linked lists do not require contiguous memory locations.

Imagine a treasure hunt where each clue leads you to the next location. Each node is a clue with information and a pointer to the next clue.

Example: A linked list storing the same temperatures: 30 -> 32 -> 31 -> 29 -> 33, where each arrow represents a pointer to the next node.

Advantages:

  • Dynamic size: easy to grow or shrink by adding or removing nodes.
  • Efficient insertion and deletion at any position (O(1) if pointer is known).

Disadvantages:

  • Accessing elements is slower (O(n)) because you must traverse nodes sequentially.
  • Requires extra memory for storing pointers.

Memory Layout Visualization

30 32 31 29 33 Array: Contiguous memory 30 32 31 29 33 Linked List: Nodes with pointers

Stacks and Queues

Next, we explore two important abstract data types: stack and queue. Both are collections with specific rules for adding and removing elements.

Stack (LIFO)

A stack follows the Last In, First Out (LIFO) principle. Imagine a stack of plates: you add (push) a plate on top and remove (pop) the top plate first.

Operations:

  • push(element): Add an element to the top.
  • pop(): Remove and return the top element.
  • peek(): View the top element without removing it.

Example: Push 10, then 20, then 30. Pop returns 30 first, then 20, then 10.

Queue (FIFO)

A queue follows the First In, First Out (FIFO) principle. Think of a queue at a ticket counter: the first person to arrive is the first served.

Operations:

  • enqueue(element): Add an element at the rear.
  • dequeue(): Remove and return the front element.
  • front(): View the front element without removing it.

Example: Enqueue 5, then 10, then 15. Dequeue returns 5 first, then 10, then 15.

Operation Flowchart

graph TD    subgraph Stack      A[Push 10] --> B[Push 20]      B --> C[Push 30]      C --> D[Pop returns 30]      D --> E[Pop returns 20]      E --> F[Pop returns 10]    end    subgraph Queue      G[Enqueue 5] --> H[Enqueue 10]      H --> I[Enqueue 15]      I --> J[Dequeue returns 5]      J --> K[Dequeue returns 10]      K --> L[Dequeue returns 15]    end

Trees and Graphs

Moving to more complex structures, trees and graphs allow us to represent hierarchical and networked data.

Trees

A tree is a collection of nodes connected in a hierarchy, with one node called the root. Each node may have zero or more child nodes, and nodes with no children are called leaves.

Binary Tree: Each node has at most two children, called left and right.

Binary Search Tree (BST): A binary tree where left child nodes contain smaller values and right child nodes contain larger values than the parent.

Tree Traversals

Traversal means visiting all nodes in a specific order:

  • Inorder (LNR): Left subtree -> Node -> Right subtree
  • Preorder (NLR): Node -> Left subtree -> Right subtree
  • Postorder (LRN): Left subtree -> Right subtree -> Node

Graphs

A graph consists of nodes called vertices connected by edges. Graphs can be:

  • Undirected: Edges have no direction (e.g., friendship networks).
  • Directed: Edges have direction (e.g., Twitter followers).

Graphs are used to model complex relationships like social networks, road maps, and communication networks.

Graph Traversals

  • Breadth-First Search (BFS): Visits nodes level by level using a queue.
  • Depth-First Search (DFS): Explores as far as possible along each branch before backtracking, using recursion or a stack.

Tree and Graph Structures

15 10 20 8 12 Binary Search Tree A B C D Undirected Graph

Sorting Algorithms

Sorting is the process of arranging data in a particular order, usually ascending or descending. Efficient sorting is vital for faster searching and data organization.

Common Sorting Algorithms

Bubble Sort

Repeatedly swaps adjacent elements if they are in the wrong order. Simple but inefficient for large data.

Selection Sort

Selects the minimum element from unsorted part and swaps it with the first unsorted element.

Insertion Sort

Builds the sorted array one element at a time by inserting elements at their correct position.

Merge Sort

A divide and conquer algorithm that divides the list into halves, sorts them recursively, and merges the sorted halves.

Quick Sort

Another divide and conquer method that selects a pivot, partitions the array around the pivot, and sorts partitions recursively.

Merge Sort Flowchart

graph TD    A[Start with array] --> B[Divide array into two halves]    B --> C[Recursively sort left half]    B --> D[Recursively sort right half]    C --> E[Merge sorted halves]    D --> E    E --> F[Sorted array]

Searching Algorithms

Searching algorithms help find an element within a data structure.

Linear Search

Checks each element one by one until the target is found or the list ends. Works on unsorted data but inefficient for large data.

Binary Search

Works only on sorted arrays. It divides the search interval in half repeatedly, comparing the middle element to the target.

Binary Search Flowchart

graph TD    A[Start] --> B[Set low=0, high=n-1]    B --> C{Is low <= high?}    C -- No --> D[Element not found]    C -- Yes --> E[Calculate mid = (low+high)/2]    E --> F{Is array[mid] == target?}    F -- Yes --> G[Return mid]    F -- No --> H{Is array[mid] < target?}    H -- Yes --> I[Set low = mid + 1]    H -- No --> J[Set high = mid - 1]    I --> C    J --> C

Algorithm Analysis and Big O Notation

To compare algorithms, we analyze their time complexity (how running time grows with input size) and space complexity (how much extra memory is needed).

Big O notation expresses the upper bound of an algorithm's growth rate, focusing on the dominant term and ignoring constants.

Comparison Table of Time Complexities

Algorithm Best Case Average Case Worst Case
Bubble Sort O(n) O(n2) O(n2)
Selection Sort O(n2) O(n2) O(n2)
Insertion Sort O(n) O(n2) O(n2)
Merge Sort O(n log n) O(n log n) O(n log n)
Quick Sort O(n log n) O(n log n) O(n2)
Linear Search O(1) O(n) O(n)
Binary Search O(1) O(log n) O(log n)
{"points": ["Big O notation helps compare algorithm efficiency by focusing on growth rates.","Divide and conquer algorithms like merge sort are efficient for large data.","Binary search is fast but requires sorted data."], "conclusion": "Understanding complexities guides the choice of algorithms for real-world problems."}

Formula Bank

Time Complexity of Binary Search
\[ O(\log n) \]
where: \( n \) = number of elements in the array
Time Complexity of Bubble Sort
\[ O(n^2) \]
where: \( n \) = number of elements to sort
Space Complexity of Merge Sort
\[ O(n) \]
where: \( n \) = number of elements to sort
Big O Notation
\[ f(n) = O(g(n)) \iff \exists c > 0, n_0 > 0 \text{ such that } f(n) \leq c \cdot g(n) \text{ for all } n \geq n_0 \]
where: \( f(n) \) = function representing algorithm complexity, \( g(n) \) = comparison function

Worked Examples

Example 1: Implementing a Stack Using an Array Easy
Implement a stack of integers using an array of size 5. Perform the operations: push 10, push 20, pop, push 30, pop, pop. Show the stack state after each operation.

Step 1: Initialize an array of size 5 and a variable top = -1 indicating the stack is empty.

Step 2: push(10): Increment top to 0 and store 10 at array[0]. Stack: [10]

Step 3: push(20): Increment top to 1 and store 20 at array[1]. Stack: [10, 20]

Step 4: pop(): Remove element at array[1] (20), decrement top to 0. Stack: [10]

Step 5: push(30): Increment top to 1 and store 30 at array[1]. Stack: [10, 30]

Step 6: pop(): Remove element at array[1] (30), decrement top to 0. Stack: [10]

Step 7: pop(): Remove element at array[0] (10), decrement top to -1 (empty stack).

Answer: Stack states after operations: [10], [10, 20], [10], [10, 30], [10], empty.

Example 2: Binary Search on a Sorted Array Medium
Given a sorted array [3, 8, 15, 20, 25, 30, 40], find the position of element 25 using binary search.

Step 1: Set low = 0, high = 6 (array indices).

Step 2: Calculate mid = (0 + 6) / 2 = 3. Check array[3] = 20.

Step 3: Since 25 > 20, set low = mid + 1 = 4.

Step 4: Calculate mid = (4 + 6) / 2 = 5. Check array[5] = 30.

Step 5: Since 25 < 30, set high = mid - 1 = 4.

Step 6: Calculate mid = (4 + 4) / 2 = 4. Check array[4] = 25.

Step 7: Element found at index 4.

Answer: The element 25 is at position 4 (0-based index).

Example 3: Merge Sort on a List of Numbers Medium
Sort the list [38, 27, 43, 3, 9, 82, 10] using merge sort.

Step 1: Divide the list into two halves: [38, 27, 43] and [3, 9, 82, 10].

Step 2: Recursively divide [38, 27, 43] into [38] and [27, 43].

Step 3: Divide [27, 43] into [27] and [43].

Step 4: Merge [27] and [43] -> [27, 43].

Step 5: Merge [38] and [27, 43]: Compare 38 and 27 -> 27; then 38 and 43 -> 38; then 43 -> [27, 38, 43].

Step 6: Recursively divide [3, 9, 82, 10] into [3, 9] and [82, 10].

Step 7: Divide [3, 9] into [3] and [9], merge -> [3, 9].

Step 8: Divide [82, 10] into [82] and [10], merge -> [10, 82].

Step 9: Merge [3, 9] and [10, 82]: Compare 3 and 10 -> 3; 9 and 10 -> 9; then 10 and 82 -> 10; then 82 -> [3, 9, 10, 82].

Step 10: Merge [27, 38, 43] and [3, 9, 10, 82]: Compare 27 and 3 -> 3; 27 and 9 -> 9; 27 and 10 -> 10; 27 and 82 -> 27; 38 and 82 -> 38; 43 and 82 -> 43; then 82 -> [3, 9, 10, 27, 38, 43, 82].

Answer: Sorted list is [3, 9, 10, 27, 38, 43, 82].

Example 4: Graph Traversal Using BFS Hard
Perform a breadth-first search (BFS) traversal starting from node A on the following undirected graph: A connected to B and C; B connected to D; C connected to D and E; D connected to E.

Step 1: Initialize a queue and a visited set. Enqueue starting node A and mark visited.

Step 2: Dequeue A, visit it. Enqueue neighbors B and C, mark visited.

Step 3: Dequeue B, visit it. Enqueue neighbor D (not visited), mark visited.

Step 4: Dequeue C, visit it. Enqueue neighbors D and E; D already visited, enqueue E and mark visited.

Step 5: Dequeue D, visit it. Enqueue neighbor E; already visited, skip.

Step 6: Dequeue E, visit it. No unvisited neighbors.

Answer: BFS traversal order: A -> B -> C -> D -> E.

Example 5: Calculating Time Complexity of Nested Loops Hard
Analyze the time complexity of the following code snippet:
    for (i = 1; i <= n; i++) {      for (j = 1; j <= i; j++) {        // constant time operation      }    }    

Step 1: Outer loop runs from 1 to n -> n iterations.

Step 2: Inner loop runs from 1 to i for each i.

Step 3: Total operations = \(\sum_{i=1}^{n} i = \frac{n(n+1)}{2}\)

Step 4: Simplify to \(O(n^2)\) ignoring constants and lower order terms.

Answer: Time complexity is \(O(n^2)\).

Tips & Tricks

Tip: Remember that arrays have fixed size but provide O(1) access, while linked lists allow dynamic size but O(n) access.

When to use: When choosing between data structures for memory and speed trade-offs.

Tip: Use the 'divide and conquer' approach to simplify complex problems like sorting and searching.

When to use: For algorithms like merge sort and binary search.

Tip: For tree traversals, use mnemonic 'NLR' for preorder, 'LNR' for inorder, and 'LRN' for postorder.

When to use: To quickly recall traversal orders during exams.

Tip: In Big O analysis, focus on the highest order term and ignore constants for asymptotic behavior.

When to use: When estimating algorithm efficiency.

Tip: Practice dry-running algorithms with small input sets to understand their behavior before coding.

When to use: When learning new algorithms or debugging.

Common Mistakes to Avoid

❌ Confusing stack (LIFO) with queue (FIFO) behavior.
✓ Recall stack operations are push/pop from the same end, queue operations are enqueue at rear and dequeue at front.
Why: Terminology and operation order can be confusing without practice.
❌ Assuming binary search works on unsorted arrays.
✓ Binary search requires a sorted array to function correctly.
Why: Lack of understanding of algorithm prerequisites.
❌ Ignoring base cases in recursive algorithms leading to infinite recursion.
✓ Always define and test base cases to terminate recursion.
Why: Forgetting base cases is a common oversight.
❌ Misinterpreting Big O notation by including constants and lower order terms.
✓ Focus only on the dominant term and ignore constants in asymptotic analysis.
Why: Misunderstanding the purpose of Big O notation.
❌ Mixing up traversal orders in trees.
✓ Use mnemonics and practice to differentiate preorder, inorder, and postorder traversals.
Why: Similar names and sequences cause confusion.
Curated videos per subtopic
Top YouTube explainers, AI-ranked for your exam and language. Unlocks with subscription.
Unlock

Try Practice next.

Progress tracking is paywalled — subscribe to mark subtopics as understood and save your streak.

Go to practice →
Ask a doubt
Data structures and algorithms · 10 free messages
Ask me anything about this subtopic. You have 10 free messages this session — chat history isn't saved in preview.