Hi everyone,
I’ve really been having fun with AtCoder lately. In a previous post, I talked about the AtCoder Beginner Contest (ABC). They run a lot of these contests. Do check them out if you have time; I’ve found that at least one or two problems in each contest tend to align well with what we’re doing here.
Today, we’ll take a look at the Leaping Tak problem from AtCoder Beginner Contest 179.
There is a line of cells numbered $1$, $2$, …, $n$. Tak is in cell $1$ and wants to get to cell $n$, and we want to know the total number of ways that he can do this.
Tak is restricted in the moves that he can make. First, he can only move to the right; moving left is not allowed. Second, he is required to jump only distances that are within one of the provided jump distance intervals.
For example, suppose that there are two jump distance intervals: 2-4 and 1-1. The first interval allows Tak to jump two, three, or four cells to the right; the second allows Tak to jump one cell to the right. So, those are the four valid moves here: jump two cells, jump three cells, jump four cells, jump one cell. No other jump is allowed.
A restriction given in the problem description is that jump intervals cannot overlap. For example, you’ll never see the jump intervals 1-4 and 2-5 together, because they overlap (they both contain the jump distances 4 and 5).
In how many ways can Tak get from cell $1$ to cell $n$ using the allowed jump distances?
The first line of input contains two integers: the number $n$ of cells and the number $k$ of nonoverlapping jump distance intervals. $n$ is between 2 and 200000; $k$ is between 1 and 10.
Each of the next $k$ lines contains the information for one jump distance interval. The format for each of these lines is two integers: the first gives the minimum jump distance in the interval (greater than or equal to 1) and the second gives the maximum jump distance in the interval (less than or equal to n).
Output the number of ways that Tak can get from cell $1$ to cell $n$, mod 998244353.
(What’s up with the mod 998244353 here? The reason is that without this the solution could become massive, far bigger than what can be represented by any integer data type. Using mod in this way keeps the solution size under control. For more about this, see the Visiting Grandma example in Chapter 5; there, we used mod in a similar way to keep the reported number of shortest paths manageable.)
The time limit for solving the test cases is two seconds.
Just to make sure everything’s clear, let’s see a small test case:
5 2
2 4
1 1
Here we have five cells and two jump distance intervals. The first jump distance interval is 2-4; the second is 1-1.
The correct answer here is 8, which means that there are 8 ways for Tak to get from cell 1 to cell 5.
Can you figure out what those 8 ways are?
Here they are:
This is a problem where we’re asked to calculate the total number of ways that something can happen; specifically, the number of ways that Tak can get from cell $1$ to cell $n$. The closest example in the book is the Ways to Pass problem from Chapter 3. The closest example so far on this blog is the Ancient Manuscript problem. In all of these problems, the pattern is the same: we look at all possible final moves that we could make, and we add to our total for each one.
I’m going to break the code down into two parts: first, the main
function that reads the input, and then the solve
function that actually solves the problem.
Here’s the main
function:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
#define N 200000 // Max number of cells
int main(void) {
int n, k, i, j, start, end, res;
static int jump_distances[N + 1];
static int memo[N + 1];
scanf("%d%d", &n, &k);
for (i = 0; i < k; i++) {
scanf("%d%d", &start, &end);
for (j = start; j <= end; j++)
jump_distances[j] = 1;
}
for (i = 1; i <= n; i++)
memo[i] = -1;
res = solve(jump_distances, n, memo);
printf("%d\n", res);
return 0;
}
The key data structure here is the jump_distances
array (5). If jump_distances[x]
is 0
, then it means that x
is not a valid jump distance; if jump_distances[x]
is 1
, then x
is a valid jump distance.
The code sets up this array by looping through each jump distance interval (10), and then setting each jump distance in the interval to 1
(13).
The final thing this code does is call the solve
function (19), whose code we’ll see next. Notice that I’m planning to use memoization here, as signaled by the third argument.
Now here’s the solve
function – the function that uses memoization to solve the problem.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
#define MOD 998244353
int solve(int jump_distances[], int n, int memo[]) {
int i, total;
if (n <= 0)
return 0;
if (n == 1) {
memo[n] = 1;
return memo[n];
}
if (memo[n] != -1)
return memo[n];
total = 0;
for (i = 1; i < n; i++)
if (jump_distances[i])
total = (total + solve(jump_distances, n - i, memo)) % MOD;
memo[n] = total;
return memo[n];
}
How many ways are there to get to cell n
? To answer that question, we go through each possible jump distance from 1 to n - 1
(16), and we ask: is this a valid jump distance (17)? If it is, then we have new ways to get to cell n
; namely, get to cell n - i
and then jump from there to cell n
. The number of new ways is the number of ways we can get to cell n - i
(18), because we can extend each of those using jump distance i
to give us a way to get to cell n
.
Don’t forget to mod the result (18)! This doesn’t generally affect our solution to the kinds of test cases that we often construct by hand, but leaving this out will certainly cause us to fail some of the official test cases.
We just developed some pretty clean code, eh? We’ve taken into account all possible final moves. For each, we used recursion+memoization to find the number of ways to solve the smaller subproblem that doesn’t include the final move. And we added up the solutions to all of these subproblems to solve the original problem. Sounds like textbook memoization/dynamic programming to me.
Go ahead. Submit the code to AtCoder.
If you do
…
You’ll find that
…
It isn’t good enough! Here, my code is timing out on 6 of the test cases.
Ugh. Looks like we have some inefficiency that we need to take care of before we’re done with this one!
The material in Chapter 3 of the book prepares you to get to this point. So, if you came up with this kind of solution, then you’re on track. But how can we finish up? Here’s a hint: what we’ll need can be found in Chapter 6 (where we end up revisiting dynamic programming!).
The key that makes this problem a little easier than we might fear, and the source of the gains in efficiency we’ll be able to realize, is in the number of jump distance intervals. There can be at most ten of them. That’s a very small number! If we could somehow deal with each of these intervals in a small number of steps, then we’d be done. The problem right now is that one of these intervals could cause us to do a huge amount of work. For example, check out this test case:
50000 1
100 48000
Just one interval here, but it takes more than five seconds to run on my laptop. Why? Because, yes, it’s just one interval, but it’s huge, inducing our code to test a huge number of jump distances. For example, as the final move in the sequence, our code will test 47901 jump distances; solving for each of the resulting subproblems will require testing thousands of their own jump distances.
We need a way to handle each jump distance interval much more quickly. And we will do that in the next post. Stay tuned!
Written on December 30th, 2020 by Daniel Zingaro