Fall 2024 CSE 317 Algorithm Questions
Fall 2024 CSE 317 Algorithm Questions
The shortest common superstring problem is NP-complete because it involves finding the shortest string that contains each of a given set of strings as a substring, a task for which there is no known polynomial-time solution. The challenge lies in needing to determine the best way to overlap the strings to minimize the total length, which requires examining potentially exponential combinations of how these strings can be concatenated. This problem is analogous to the traveling salesman problem where one seeks the shortest route visiting all points .
To design a parallel algorithm using n processors to compute the sequence of partial sums for a sequence X = ⟨x1, x2, . . . , xn⟩, we can utilize the parallel prefix sum algorithm. The idea is to use a binary tree structure where each processor is assigned an element xi. At each level of the tree, processors sum their assigned elements with the elements assigned to the processors they are interacting with. This operation continues until the root of the tree is reached, resulting in the complete prefix sum at each processor. The time complexity of this operation is O(log n), as the depth of the binary tree is logarithmic with respect to the number of elements .
To find the missing integer in an unsorted array of n elements containing all but one integer from 1 to n+1, we can use the formula for the sum of the first n+1 numbers: S = (n+1)(n+2)/2. Calculate the sum of the array's elements, and then subtract this sum from S. The result will be the missing integer. This approach has a time complexity of O(n) since it involves a single pass through the array to compute its sum .
A randomized algorithm to find an element in the upper half of a sorted set could involve randomly selecting elements and testing if they fall in the desired half, stopping once a sufficient sampling yields a valid element. The probability of success with each sample is 0.5, and repeating the process log(n) times ensures with high probability (1 - 1/n) that an upper-half element is found due to repeated trials and the probabilistic guarantee .
To find two elements in an array with a specific difference x efficiently, use a hash set to store elements encountered so far. For each element ai in the array, check if ai + x or ai - x exists in the set. If either exists, the pair is found. This approach has a time complexity of O(n), as it involves a single iteration over the array and constant-time hash set operations for additions and lookups .
To minimize scalar multiplications in matrix chain multiplication using dynamic programming, define a table where entry (i, j) represents the minimum number of multiplications needed to compute the product of matrices from Mi to Mj. The table is filled by considering all possible ways to parenthesize the product, computing the cost for each split, and storing the optimal values based on previously computed results. The solution emerges from building the table starting with the smallest subproblems and progressively solving larger ones. This results in a time complexity of O(n^3) due to nested loops computing the minimal product cost .
Kruskal’s algorithm constructs a minimum spanning tree (MST) by sorting edges by weight and adding the next smallest edge to the growing MST, ensuring it does not form a cycle. If all edges have unique costs, Kruskal’s algorithm ensures that the MST is unique, as there is no ambiguity in selecting edges. When all edge costs are uniform, multiple spanning trees could have the same total weight, allowing multiple MST configurations. This approach highlights how edge uniqueness ensures deterministic MST construction, while uniformity results in non-unique possibilities .
Dynamic programming is a technique used to solve problems by breaking them down into simpler overlapping subproblems, storing the solutions to these subproblems to avoid redundant calculations. Divide and conquer involves breaking a problem into independent subproblems, solving each separately, and combining their results. When applying these techniques to compute the Fibonacci sequence: Dynamic programming uses a bottom-up approach, storing the results of each Fibonacci number as it builds up from the base cases, resulting in an O(n) time complexity. Divide and conquer, as applied naïvely to Fibonacci, results in exponential time complexity due to repetitive calculations without storing results .
Counting inversions in an array using divide and conquer involves splitting the array into two halves, recursively counting the inversions in each half, and then counting inversions caused by pairs split between the two halves during merging. The merge step is similar to that of merge sort and combines the results while counting cross-inversions efficiently. The Master Theorem supports the O(n log n) time complexity, given T(n) = 2T(n/2) + O(n), as it describes the recurrence relations typical for divide and conquer algorithms where merging is linear .
To efficiently find the number of strongly connected components in a directed graph, use Kosaraju’s or Tarjan’s algorithm. Both algorithms work in O(n + m) time, where n is the number of vertices and m is the number of edges. Kosaraju's algorithm involves two DFS traversals: one to order vertices and another on the transposed graph to identify components. Tarjan’s uses a single DFS with low-link values to discover components. Both require linear time relative to the size of the graph .