Dao De Code

Pattern Matching on Duration Regex and Non Capturing Groups

Last week I was writing a method to parse a String representing a throttle policy in somewhat plain English. The input string can be
2 requests per 10 minutes, 1 request per 100 seconds, or 12 per 1 day. The word request or requests is optional and is there only to make the input more human readable. The time unit has to be one that Scala's Duration class can parse. The output is captured in the following case class:

case class ThrottlePolicy(requests: Int, per: FiniteDuration)  

So right away I decided it could be done nicely and easily with a regex and pattern matching. In the first iteration I came up with this regex and this code:

import scala.concurrent.duration._

object ThrottlePolicy {
  private val Pattern = raw"\s*(\d+)(\s+requests?)?\s+per\s+(\d+)\s+([a-z]+)\s*".r

  def apply(s: String): ThrottlePolicy = s match {
    // The second group only marks the optional word, so its value is ignored with _
    case Pattern(requests, _, time, timeUnit) =>
      ThrottlePolicy(requests.toInt, Duration(time.toLong, timeUnit))
    case _ => throw new IllegalArgumentException("boo")
  }
}

It looks simple and works for valid input strings like
1 request per 2 days and even 100 per 1 minute. But it fails badly when timeUnit is not valid. For example, ThrottlePolicy("1 request per 1 daysy") throws a java.util.NoSuchElementException: key not found: daysy instead of my beautifully meaningful IllegalArgumentException("boo"). The regex itself matches, because [a-z]+ accepts any lowercase word, and the exception comes from Duration's lookup of the unit name. Plus, that underscore in case Pattern(requests, _, time, timeUnit) looks ugly.
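To see where the exception comes from, here is a minimal sketch that calls Duration's (Long, String) factory directly, without any regex involved. The object name DurationFailureDemo and its fields are made up for illustration.

```scala
import scala.concurrent.duration._
import scala.util.Try

// Hypothetical demo object, not part of the original code.
object DurationFailureDemo {
  // A known unit name is accepted and yields a FiniteDuration.
  val valid: Try[FiniteDuration] = Try(Duration(2L, "days"))

  // An unknown unit name makes the factory throw. Try captures the exception
  // here; inside apply above nothing catches it, so it escapes to the caller.
  val invalid: Try[FiniteDuration] = Try(Duration(1L, "daysy"))
}
```

Inspecting DurationFailureDemo.invalid should show a Failure that wraps the exception, while valid holds the parsed duration.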

The first problem we can solve with... right, more pattern matching! Luckily, Duration also has an extractor that accepts a String, so I got this:

object ThrottlePolicy {
  private val Pattern = raw"\s*(\d+)(\s+requests?)?\s+per\s+(\d+\s+[a-z]+)\s*".r

  def apply(s: String): ThrottlePolicy = s match {
    // Duration(time, timeUnit) matches only if the third group is a parseable duration
    case Pattern(requests, _, Duration(time, timeUnit)) =>
      ThrottlePolicy(requests.toInt, Duration(time, timeUnit))
    case _ => throw new IllegalArgumentException("boo")
  }
}

Notice how it also removed the toLong call at Duration construction! In fact, we can simplify our regex a bit and delegate the parsing of a valid duration to Duration's extractor:

private val Pattern = raw"\s*(\d+)(\s+requests?)?\s+per(.*)".r
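As a rough illustration of why the regex no longer needs to validate the duration itself, the sketch below feeds strings straight to Duration's extractor. The object name and the parse helper are hypothetical and exist only for this example.

```scala
import scala.concurrent.duration._

// Hypothetical demo object, not part of the original code.
object DurationExtractorDemo {
  // Returns the parsed duration, or None when Duration's String
  // extractor does not match the input.
  def parse(s: String): Option[FiniteDuration] = s match {
    case Duration(length, unit) => Some(Duration(length, unit))
    case _                      => None
  }
}
```

With this helper, parse(" 10 minutes") yields a duration equal to 10.minutes even with the leading space that (.*) captures, while parse("1 daysy") yields None, so the match in apply falls through to the IllegalArgumentException case.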

And now what about that _ I don't like? It stands for a group that I use only to mark an optional word and whose value I don't care about (that's why I have _ there). So I just need to ignore that group somehow.

And then, thanks to my excellent google-driven-development skills, I discovered a nice thing called a non-capturing group: written as (?:...), it groups part of the pattern without capturing it. And my code looks slightly better now:

object ThrottlePolicy {
  // (?:...) groups the optional word without capturing it, so no _ is needed below
  private val Pattern = raw"\s*(\d+)(?:\s+requests?)?\s+per(.*)".r

  def apply(s: String): ThrottlePolicy = s match {
    case Pattern(requests, Duration(time, timeUnit)) =>
      ThrottlePolicy(requests.toInt, Duration(time, timeUnit))
    case _ => throw new IllegalArgumentException("boo")
  }
}
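
To make the difference concrete, this small sketch compares the two patterns side by side. The object and value names are illustrative only; groupCount comes from java.util.regex.Matcher, which is reachable through the pattern field of Scala's Regex.

```scala
// Hypothetical demo object, not part of the original code.
object GroupDemo {
  val Capturing    = raw"\s*(\d+)(\s+requests?)?\s+per(.*)".r
  val NonCapturing = raw"\s*(\d+)(?:\s+requests?)?\s+per(.*)".r

  // Number of capturing groups each pattern exposes to a case clause.
  val capturingGroups: Int    = Capturing.pattern.matcher("").groupCount()
  val nonCapturingGroups: Int = NonCapturing.pattern.matcher("").groupCount()

  // With the non-capturing version only two values need to be bound.
  def split(s: String): Option[(String, String)] = s match {
    case NonCapturing(requests, rest) => Some((requests, rest))
    case _                            => None
  }
}
```

The first pattern reports three groups and the second reports two, which is why the case clause can drop the underscore. The optional word still takes part in matching, so inputs with and without it are both accepted.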