Advent of Code 2020 Day 2

This month I am doing to do the Advent of Code, which is a yearly advent calendar of 25 small programming puzzles

For more information please see the first post in the series.

In this challenge we have to parse data. There are multiple ways to do this and programming languages do have features that support parsing of for example numeric data. They however typically lack a method to know where the data ended without going over the data multiple times.1 The other ways to parse such data is to use a complex parsing framework (which is likely slower for data of this complexity) or by rolling your own parser from scratch. We are choosing the latter here.

Format of the Policy Data

The format of the policy data is shown on the puzzle page, and basically follows the regex (\d+)-(\d+) ([a-z]): ([a-z]+). The groups match only the data of importance. So how can we write a fast parser for it?

First we observe that the way humans write numbers is in big endian. The first digit is the “most signifficant”, which makes iteratively parsing a string of (integer) digits easy:

fn parse_number(data: &str) -> Option<(usize, &str)> {
  let mut val = 0;
  let mut pos = 0;
  for c in data.chars() {
    if !c.is_ascii_digit() {
      break;
    }
    let digit = char_to_digit(c);
    val *= 10;
    val += digit;
    pos += c.len_utf8();
  }
  if pos == 0 {
    None
  } else {
    Some((val, &data[pos..]))
  }
}

One piece of the puzzle that is missing: how do we quickly convert a char to a digit? Taking a look at the ASCII encoding it reveals that the digits are between 0x30 and 0x39 (inclusive), meaning that assuming the character is in that range, we simply need to subtract 0x30. We can rely on this because is_ascii_digit already checks that the character is one of the 10 ASCII digits.

fn char_to_digit(c: char) -> usize {
  (c as u32) as usize - 0x30
}

Parsing the remainder is simple, right? The character group can be matched with

fn parse_char(data: &str) -> Option<(char, &str)> {
  let c = data.chars().next()?;
  if !c.is_ascii_lowercase() {
    return None;
  }
  Some((c, &data[c.len_utf8()..]))
}

the string with

fn parse_string(data: &str) -> Option<(&str, &str)> {
  let mut off = 0;
  for c in data.chars() {
    if !c.is_ascii_lowercase() {
      break;
    }
    off += c.len_utf8();
  }
  if off == 0 {
    None
  } else {
    Some((&data[..off], &data[off..]))
  }
}

and literal characters with

fn match_character(data: &str, c: char) -> Option<&str> {
  if data.chars().next()? != c {
    return None;
  }
  Some(&data[c.len_utf8()..])
}

Let's combine them together to parse a line!

fn parse_line(data: &str) -> Option<(&str, usize, usize, char, &str)> {
  let (from, data) = parse_number(data)?;
  let data = match_character(data, '-')?;
  let (to, data) = parse_number(data)?;
  let data = match_character(data, ' ')?;
  let (expected_char, data) = parse_char(data)?;
  let data = match_character(data, ':')?;
  let data = match_character(data, ' ')?;
  let (password, data) = parse_string(data)?;
  let data = match_character(data, '\n')?;
  Some((data, from, to, expected_char, password))
}

We can parse the entire data with:

fn parse_data(mut data: &str) -> Vec<(usize, usize, char, &str)> {
  let mut vec = Vec::new();
  loop {
    let line = parse_line(data);
    if line.is_none() {
      break;
    }
    let line = line.unwrap();
    vec.push((line.1, line.2, line.3, line.4));
    data = line.0;
  }  
  vec
}

Puzzle 1

We have to check if the string part contains somewhere between n and m instances of a character. We can do this with:

pub fn part1() -> usize {
  parse_data(DATA).into_iter().filter(|(min, max, c, s)| {
    let count = s.chars().filter(|sc| sc == c).count();
    *min <= count && *max >= count
  }).count()
}

This code runs in 44.096μs. Can we do better?

Using with_capacity

From yesterday we learned that there is a with_capacity function to preallocate data. This exists for Vec too, so let's preallocate the 2000 elements:

  let mut vec = Vec::with_capacity(2000);

The code runs in 37.659μs, so 15% faster. Not bad, but do we need to allocate memory and immediately consume it? what if we created an iterator impl?

Using iterators instead of a vector

An iterator impl is an implementation of the iterator trait. It looks something like this:

pub trait Iterator {
  type Item;
  fn next(&mut self) -> Option<Self::Item>;
}

The state of such an iterator is however bound to a (currently unnamed) lifetime. In order to use it we need to add a lifetime to the iterator type

struct PolicyIterator<'a> {
  data: &'a str
}

and an iterator implementation of

impl<'a> Iterator for PolicyIterator<'a> {
  type Item = (usize, usize, char, &'a str);
  fn next(&mut self) -> Option<Self::Item> {
    let line = parse_line(self.data)?;
    self.data = line.0;
    Some((line.1, line.2, line.3, line.4))
  }
}

and to use it, we need to replace the call to parse_data(DATA).into_iterator() with

  PolicyIterator { data: DATA }

This now runs in 37.298μs. Not much faster. Why didn't the speed increase? Maybe we can safely remove some things from our code that would speed it up.

Removing unnecessary checks

While our data is entirely comprised of lowercase ascii characters, do we really need such a check? Let's remove the lowercase ascii check from the first and replace the second one for a check for not newline.

  if c == '\n' {
    break;
  }

And while we are add it, remove the sanity checks that specific characters are colons or spaces or newlines. It will accept more input than it should but that is not that big of a problem in this case.

Similarly we are moving the pointer forward by a potentially variable amount of bytes, even though we know that the digits for example only contain 1 byte characters.

All these changes however yielded in no speed improvement

Inlining

While the compiler already attempts to do inlining in release mode, we can be more explicit about it and manually inline all of these small functions. Removing all of the parts of the code that are now redundant leads to this:

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub struct PolicyIterator<T>
where T: Iterator<Item = char> + ?Sized
{
  it: T
}

impl<T> Iterator for PolicyIterator<T>
where T: Iterator<Item = char> + ?Sized
{
  type Item = bool;
  fn next(&mut self) -> Option<Self::Item> {
    let mut minimum = 0;
    loop {
      let c = self.it.next()?;
      if !c.is_ascii_digit() {
        break;
      }
      minimum *= 10;
      minimum += (c as u32) - 0x30;
    }
    let mut maximum = 0;
    loop {
      let c = self.it.next()?;
      if !c.is_ascii_digit() {
        break;
      }
      maximum *= 10;
      maximum += (c as u32) - 0x30;
    }
    let expected = self.it.next()?;
    self.it.next()?;
    self.it.next()?;
    let mut counted = 0;
    loop {
      let c = self.it.next()?;
      if c == '\n' {
        break;
      }
      if c == expected {
        counted += 1;
      }
    }
    Some(counted >= minimum && counted <= maximum)
  }
}

pub fn part1() -> usize {
  PolicyIterator {
    it: DATA.chars()
  }.filter(|c| *c).count()
}

This code now only takes 18.324μs, or over 50% faster!

Switching to bytes

Rust uses unicode strings by default. This is what you would want >95% of the time, but in this case the input file is purely ASCII text. Unlike UTF-8 ASCII is a fixed-width encoding that covers the english alphabet and a bunch of special characters widely used in programming. We can loop over bytes instead by just replacing the trait bound to expect u8 and replacing DATA.chars() with DATA.bytes() and putting a b before the '\n' char literal.

The code runs in 11.943μs, or another 34.285% faster.

Puzzle 2

The code for puzzle 2 is almost the same, however instead of a certain character being required a certain amount of times, this one requires that a character is present exactly once in two spots. I chose to go with a branchless way here, by simply XORing a state variable with (position == posa || position == posb) && (ch == expected), like so:

    let mut i = 0;
    let mut valid = false;
    loop {
      let c = self.it.next()?;
      if c == b'\n' {
        break;
      }
      valid ^= (i == posa || i == posb) && (c == expected);
      i += 1;
    }

This code runs in 16.998μs (compared to 18.804μs you would get from having an if statement).

Footnotes

1

Ironically, C actually does not suffer from this problem with the strto(u)l(l) functions.