It’s that time of the year again! 🎄
I learned about Advent of Code a couple of years ago, when Caio challenged himself to complete every puzzle and post about it.
Advent of Code is an Advent calendar of small programming puzzles for a variety of skill sets and skill levels that can be solved in any programming language you like. People use them as interview prep, company training, university coursework, practice problems, a speed contest, or to challenge each other. – Advent of Code
I’ve tried to solve some of them, but this time of the year is usually so hectic that I don’t think I ever got past day 6. Maybe this is the year I get to the second week? (Never mind that I’m already late.)
I still have a full week ahead of exams and papers, so I’ll probably take it slow, but I wanted to lay the groundwork. I’m starting by installing {aor}, which is a neat R package with some useful functions to help you with Advent of Code, so you can focus on actually solving the puzzles.
# install.packages("devtools")
devtools::install_github("clente/aor")
Once installed and with the right cookie configurations (all explained in aor’s readme), I can simply run
> aor::day_start("2023-12-01", "aoc2023/")
✔ Fetched puzzle.
✔ Fetched input.
✔ Created directory aoc2023/01_trebuchet
✔ Wrote part 1 to aoc2023/01_trebuchet/puzzle.R
✔ Wrote input to aoc2023/01_trebuchet/input.txt
ℹ To fetch part 2, run `aor::day_continue("2023-12-01", "aoc2023/01_trebuchet/puzzle.R")`
…and I’ll have a directory for that day’s puzzle, with a template for the code and the input text! Note that you must be logged in to get the puzzle input, since inputs differ from user to user.
Next step is solving the puzzles. I’m starting with day one.
Day 1: Trebuchet?!
To sum things up, the first part of day 1 is:
- Read a file with text
- Identify the first and last numbers on each line (the “calibration”) and sum them up
Every puzzle comes with a minimal example:
For example:
1abc2
pqr3stu8vwx
a1b2c3d4e5f
treb7uchet
In this example, the calibration values of these four lines are 12, 38, 15, and 77. Adding these together produces 142.
There are many ways this can be done, but I ended up using dplyr because I’m more used to it. With a little bit of regex, it was easy enough to extract the numbers that I needed to clear part 1.
input <- "aoc2023/01_trebuchet/input.txt"

input |>
  readr::read_csv(col_names = "input", show_col_types = FALSE) |>
  dplyr::filter(input != "") |>
  dplyr::mutate(
    first = stringr::str_extract(input, "[0-9]"),
    last = stringr::str_extract(input, "[0-9](?!.*[0-9])"),
    calibration = as.numeric(paste0(first, last))
  ) |>
  dplyr::pull(calibration) |>
  sum()
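As a quick sanity check, the same two regexes can be run against the four example lines from the puzzle. This is only a minimal sketch: it assumes stringr is installed, and the names `example_lines` and `calibration` are made up for illustration.

```r
# The four example lines from the puzzle text
example_lines <- c("1abc2", "pqr3stu8vwx", "a1b2c3d4e5f", "treb7uchet")

# First digit: the leftmost match of [0-9]
first <- stringr::str_extract(example_lines, "[0-9]")

# Last digit: a digit that is NOT followed by any other digit
# (negative lookahead)
last <- stringr::str_extract(example_lines, "[0-9](?!.*[0-9])")

# Paste the two digits together and convert to a number
calibration <- as.numeric(paste0(first, last))

print(calibration)
print(sum(calibration))
```

On a line with a single digit, such as treb7uchet, both regexes match the same character, so the digit is used twice.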
Part 2 was a little bit trickier. The puzzle says:
Your calculation isn’t quite right. It looks like some of the digits are actually spelled out with letters: one, two, three, four, five, six, seven, eight, and nine also count as valid “digits”. Equipped with this new information, you now need to find the real first and last digit on each line. For example:
two1nine
eightwothree
abcone2threexyz
xtwone3four
4nineeightseven2
zoneight234
7pqrstsixteen
In this example, the calibration values are 29, 83, 13, 24, 42, 14, and 76. Adding these together produces 281.
My first idea was using regex to get all occurrences of numbers and spelled-out numbers. So my regex would look something like
rx <- "[0-9]|one|two|three|four|five|six|seven|eight|nine"
Then, I could switch the spelled out numbers, paste the first and last ones and sum them up.
switch_numbers <- function(num) {
  if (stringr::str_detect(num, "[a-z]")) {
    result <- switch(
      num,
      one = 1, two = 2, three = 3, four = 4, five = 5, six = 6, seven = 7,
      eight = 8, nine = 9
    )
  } else {
    result <- num
  }
  as.numeric(result)
}

input |>
  readr::read_csv(col_names = "input", show_col_types = FALSE) |>
  dplyr::filter(input != "") |>
  dplyr::mutate(
    numbers = stringr::str_extract_all(input, rx),
    first = purrr::map_vec(numbers, head, n = 1),
    last = purrr::map_vec(numbers, tail, n = 1),
    first = purrr::map_vec(first, switch_numbers),
    last = purrr::map_vec(last, switch_numbers),
    calibration = as.numeric(paste0(first, last))
  ) |>
  dplyr::pull(calibration) |>
  sum()
But, turns out I was wrong. You can see the problem with str_extract_all in this case:
> stringr::str_extract_all("threeight", rx)
[[1]]
[1] "three"
What I actually wanted:
[[1]]
[1] "three" "eight"
The regex I was using does not account for overlapping matches! Once “three” is matched, its final “e” has been consumed and can no longer start “eight”.
This takes us to attempt #2, where I try to take this problem into account with stringi.
switch_numbers <- function(num) {
  if (stringr::str_detect(num, "[a-z]")) {
    result <- switch(
      num,
      one = 1, two = 2, three = 3, four = 4, five = 5, six = 6, seven = 7,
      eight = 8, nine = 9
    )
  } else {
    result <- num
  }
  as.numeric(result)
}

rx <- paste0("(?=([0-9]|", paste(xfun::n2w(1:9), collapse = "|"), "))")

input |>
  readr::read_csv(col_names = "input", show_col_types = FALSE) |>
  dplyr::filter(input != "") |>
  dplyr::mutate(
    numbers = stringi::stri_match_all_regex(input, rx),
    numbers = purrr::map(numbers, ~magrittr::extract(.x, , 2)),
    first = purrr::map_vec(numbers, head, n = 1),
    last = purrr::map_vec(numbers, tail, n = 1),
    first = purrr::map_vec(first, switch_numbers),
    last = purrr::map_vec(last, switch_numbers),
    calibration = as.numeric(paste0(first, last))
  ) |>
  dplyr::pull(calibration) |>
  sum()
Using a lookahead in the regex (?=) and stringi::stri_match_all_regex did the trick and got me to the right answer! 🥳
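To see why the lookahead helps, here is a minimal sketch that runs the same idea on the seven example lines from part 2. It assumes stringi is installed. The number words are spelled out by hand instead of calling xfun::n2w, and `words`, `to_digit`, and `tokens` are illustrative names, not part of the solution above.

```r
# Number words written out by hand (an assumption, to avoid needing xfun)
words <- c("one", "two", "three", "four", "five",
           "six", "seven", "eight", "nine")

# The lookahead (?=...) matches at a position without consuming characters,
# so overlapping tokens such as "three" and "eight" in "threeight" are both
# found. The inner capture group holds the actual token.
rx <- paste0("(?=([0-9]|", paste(words, collapse = "|"), "))")

example_lines <- c(
  "two1nine", "eightwothree", "abcone2threexyz", "xtwone3four",
  "4nineeightseven2", "zoneight234", "7pqrstsixteen"
)

# Each list element is a matrix: column 1 is the (empty) full match,
# column 2 is the captured token
matches <- stringi::stri_match_all_regex(example_lines, rx)
tokens <- lapply(matches, function(m) m[, 2])

# Convert one token ("seven" or "7") to its numeric value
to_digit <- function(token) {
  if (token %in% words) match(token, words) else as.numeric(token)
}

# Combine the first and last token of each line into a two-digit number
calibration <- vapply(
  tokens,
  function(t) as.numeric(paste0(to_digit(t[1]), to_digit(t[length(t)]))),
  numeric(1)
)

# The overlapping case from the failed first attempt
overlap <- stringi::stri_match_all_regex("threeight", rx)[[1]][, 2]

print(overlap)
print(sum(calibration))
```

The same pattern without the lookahead wrapper would consume “three” and never see “eight”, which is what broke the first attempt.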
If you have any other ideas, feel free to tell me more on Mastodon or Bluesky.
Happy coding! <3