Debugging “Maximum String literal length exceeded” with scala

Today I ran into a fascinating bug. We use ficus as a HOCON auto parser for scala. It works great, because parsing configurations into strongly typed case classes is annoying. Ficus works by using a macro to invoke implicitly in scope Reader[T] classes for data types and recursively builds the nested parser.

I went to create a test for a new custom field I added to our config:

class ProductConfigTests extends FlatSpec {
  "Configs" should "be valid in QA" in {
    assert(ConfigLoader.verify(ProductsConfig, Environment.QA).isSuccess)
  }
}

Our config verifier just invokes the hocon parser and makes sure it doesn’t throw an error. ProductsConfig has a lot of fields to it, and I recently added a new one. Suddenly the test broke with the following error:

error] Error while emitting com/services/products/service/tests/ConfigTests
[error] Maximum String literal length exceeded
[error] one error found
[error] (server/test:compileIncremental) Compilation failed
[error] Total time: 359 s, completed May 10, 2018 4:56:02 PM
> test:compile
[info] Compiling 36 Scala sources to /Users/antonkropp/src/products/server/target/scala-2.12/test-classes...
java.lang.IllegalArgumentException: Maximum String literal length exceeded
	at scala.tools.asm.ByteVector.putUTF8(ByteVector.java:213)
	at scala.tools.asm.ClassWriter.newUTF8(ClassWriter.java:1114)
	at scala.tools.asm.ClassWriter.newString(ClassWriter.java:1582)
	at scala.tools.asm.ClassWriter.newConstItem(ClassWriter.java:1064)
	at scala.tools.asm.MethodWriter.visitLdcInsn(MethodWriter.java:1187)
	at scala.tools.asm.tree.LdcInsnNode.accept(LdcInsnNode.java:71)
	at scala.tools.asm.tree.InsnList.accept(InsnList.java:162)
	at scala.tools.asm.tree.MethodNode.accept(MethodNode.java:820)
	at scala.tools.asm.tree.MethodNode.accept(MethodNode.java:730)

Wat?

I fired up sbt -jvm-debug 5005 and attached to the compiler.

I can def see that there is some sort of class being written with a large const string. But why? I’d never seen this before.

I went to another service that has a test exactly like this for its config and used cfr to decompile the generated scala files:

antonkropp at combaticus in ~/src/curalate/queue-batcher/server/target/scala-2.12/test-classes/com/curalate/services/queuebatcher/service/tests (devx/minimize-io-calls)
$ ls
total 1152
-rw-r--r--  1 antonkropp  staff   47987 May 10 11:20 BatchTrackerTests.class
-rw-r--r--  1 antonkropp  staff  145293 May 10 13:34 BitGroupTests.class
-rw-r--r--  1 antonkropp  staff    3112 May  9 13:23 ConfigTests$$anon$1$$anon$2$$anon$3.class
-rw-r--r--  1 antonkropp  staff    3553 May  9 13:23 ConfigTests$$anon$1$$anon$2$$anon$4$$anon$5.class
-rw-r--r--  1 antonkropp  staff    5190 May  9 13:23 ConfigTests$$anon$1$$anon$2$$anon$4.class
-rw-r--r--  1 antonkropp  staff    3627 May  9 13:23 ConfigTests$$anon$1$$anon$2$$anon$6.class
-rw-r--r--  1 antonkropp  staff    3906 May  9 13:23 ConfigTests$$anon$1$$anon$2.class
-rw-r--r--  1 antonkropp  staff    4904 May  9 13:23 ConfigTests$$anon$1$$anon$7.class
-rw-r--r--  1 antonkropp  staff    4598 May  9 13:23 ConfigTests$$anon$1$$anon$8.class
-rw-r--r--  1 antonkropp  staff    5063 May  9 13:23 ConfigTests$$anon$1.class
-rw-r--r--  1 antonkropp  staff    3125 May  9 13:23 ConfigTests$$anon$17$$anon$18$$anon$19.class
-rw-r--r--  1 antonkropp  staff    3573 May  9 13:23 ConfigTests$$anon$17$$anon$18$$anon$20$$anon$21.class
-rw-r--r--  1 antonkropp  staff    5213 May  9 13:23 ConfigTests$$anon$17$$anon$18$$anon$20.class
-rw-r--r--  1 antonkropp  staff    3640 May  9 13:23 ConfigTests$$anon$17$$anon$18$$anon$22.class
-rw-r--r--  1 antonkropp  staff    3924 May  9 13:23 ConfigTests$$anon$17$$anon$18.class
-rw-r--r--  1 antonkropp  staff    4914 May  9 13:23 ConfigTests$$anon$17$$anon$23.class
-rw-r--r--  1 antonkropp  staff    4606 May  9 13:23 ConfigTests$$anon$17$$anon$24.class
-rw-r--r--  1 antonkropp  staff    5073 May  9 13:23 ConfigTests$$anon$17.class
-rw-r--r--  1 antonkropp  staff    3119 May  9 13:23 ConfigTests$$anon$9$$anon$10$$anon$11.class
-rw-r--r--  1 antonkropp  staff    3566 May  9 13:23 ConfigTests$$anon$9$$anon$10$$anon$12$$anon$13.class
-rw-r--r--  1 antonkropp  staff    5205 May  9 13:23 ConfigTests$$anon$9$$anon$10$$anon$12.class
-rw-r--r--  1 antonkropp  staff    3634 May  9 13:23 ConfigTests$$anon$9$$anon$10$$anon$14.class
-rw-r--r--  1 antonkropp  staff    3915 May  9 13:23 ConfigTests$$anon$9$$anon$10.class
-rw-r--r--  1 antonkropp  staff    4909 May  9 13:23 ConfigTests$$anon$9$$anon$15.class
-rw-r--r--  1 antonkropp  staff    4601 May  9 13:23 ConfigTests$$anon$9$$anon$16.class
-rw-r--r--  1 antonkropp  staff    5066 May  9 13:23 ConfigTests$$anon$9.class
-rw-r--r--  1 antonkropp  staff   87180 May  9 13:23 ConfigTests.class
-rw-r--r--  1 antonkropp  staff   69451 May  9 13:23 DbTests.class
-rw-r--r--  1 antonkropp  staff   12985 May  9 13:23 MysqlTests.class
-rw-r--r--  1 antonkropp  staff   68418 May 10 12:40 Tests.class
drwxr-xr-x  4 antonkropp  staff     128 May  9 13:23 db
drwxr-xr-x  9 antonkropp  staff     288 May  9 13:23 modules

$ java -jar ~/tools/cfr.jar ConfigTests.class

1000 lines later, I can see

So something is putting in a large string of the configuration parser compiled into the class file.

I checked the ficus source code and its not it, so it must be something with the test.

Turns out assert is a macro from scalatest:

def assert(condition: Boolean)(implicit prettifier: Prettifier, pos: source.Position): Assertion = macro AssertionsMacro.assert

Where the macro

def assert(context: Context)(condition: context.Expr[Boolean])(prettifier: context.Expr[Prettifier], pos: context.Expr[source.Position]): context.Expr[Assertion] =
    new BooleanMacro[context.type](context, "assertionsHelper").genMacro[Assertion](condition, "macroAssert", context.literal(""), prettifier, pos)

Is looking for an implicit position.

Position is from scalactic which comes with scalatest

case class Position(fileName: String, filePathname: String, lineNumber: Int)

/**
 * Companion object for <code>Position</code> that defines an implicit
 * method that uses a macro to grab the enclosing position.
 */
object Position {

  import scala.language.experimental.macros

  /**
   * Implicit method, implemented with a macro, that returns the enclosing
   * source position where it is invoked.
   *
   * @return the enclosing source position
   */
  implicit def here: Position = macro PositionMacro.genPosition
}

And here we can ascertain that the macro expansion of the ficus config parser is being captured by the position file macro and auto compiled into the assert statement!

Changing the test to be

class ProductConfigTests extends FlatSpec {
  "Configs" should "be valid in QA" in {
    validate(ConfigLoader.verify(ProductsConfig, Environment.QA).isSuccess)
  }

  /**
   * This validate function needs to exist because this bug is amazing.
   *
   * `assert` is a macro from scalatest that automatically compiles the contextual source tree
   * into the assert, so that you can get line number and metadata context if the line fails.
   *
   * The ficus macro expander for ProductConfig is larger than 65k characters, which is normally fine
   * for code, however since scalatest tries to compile this > 65k anonymous class tree as a _string_
   * it breaks the java compiler!
   *
   * By breaking the function scope and having the macro create a closure around the _validate_ block
   * it no longer violates the 65k static string constraint
   */
  private def validate(block: => Boolean): Unit = {
    assert(block)
  }
}

Now makes the test pass. What a day.

Post a comment

You may use the following HTML:
<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>