Tuesday, May 22, 2018

JEP 329 and JEP 330 Proposed for JDK 11

This past week, two Mark Reinhold messages (here and here) on the OpenJDK jdk-dev mailing list proposed two new JEPs for inclusion with JDK 11: JEP 329 ["ChaCha20 and Poly1305 Cryptographic Algorithms"] and JEP 330 ["Launch Single-File Source-Code Programs"]. I am excited about JEP 330, but that enthusiasm led me to blog on it when it was but a mere "draft" JEP (not even assigned to the 330 number at that point). The focus of the remainder of this post will therefore be on JEP 329.

The intent of JEP 329 is succinctly described in the JEP's "Summary" section: "Implement the ChaCha20 and ChaCha20-Poly1305 ciphers as specified in RFC 7539." That same "Summary" section also states, "ChaCha20 is a relatively new stream cipher that can replace the older, insecure RC4 stream cipher."

The RC4 (Rivest Cipher 4) stream cipher has already been disabled in major web browsers (early 2016) due to security risks:

The "Motivation" section of JEP 329 currently states:

The only other widely adopted stream cipher, RC4, has long been deemed insecure. The industry consensus is that ChaCha20-Poly1305 is secure at this point in time, and it has seen fairly wide adoption across TLS implementations as well as in other cryptographic protocols. The JDK needs to be on par with other cryptographic toolkits and TLS implementations.

It is worth noting this important caveat mentioned in JEP 329's "Non-Goals" section: "TLS cipher suite support will not be part of this JEP. TLS support for these ciphers will be part of a follow-on enhancement." For additional details, see JDK-8140466 : ChaCha20 and Poly1305 Cipher Suites.

The "Dependencies" section of JEP 329 states that its only dependency is on the "constant-time math APIs" embodied in JEP 324 (see my previous post for additional overview details).

JDK-8198925 : ChaCha20 and ChaCha20-Poly1305 Cipher Implementations provides additional and even lower-level details than JEP 329. For example, it provides the specification of the new class javax.crypto.spec.ChaCha20ParameterSpec and its methods.

As of this writing, there are currently 8 JEPs targeted for JDK 11 and the 2 additional JEPs highlighted in this post's title are now proposed to target JDK 11, bringing the total number of JEPs targeted or likely to be targeted to JDK 11 to ten.

Monday, May 21, 2018

New JDK 11 Files Methods for Reading/Writing Strings From/To Files

My previous post focused on the Files.isSameContent() method that is likely to be added to JDK 11. JDK-8201276 ["(fs) Add methods to Files for reading/writing a string from/to a file"] mentions this new method and also highlights the methods that are the subjects of this post:

  • readString(Path)
  • readString(Path, Charset)
  • writeString(Path, CharSequence, OpenOption...)
  • writeString(Path, CharSequence, Charset, OpenOption...)

Joe Wang recently posted the message "RFR (JDK11/NIO) 8201276: (fs) Add methods to Files for reading/writing a string from/to a file" on the core-libs-dev mailing list. In this message, Wang provided links to the related bug (JDK-8201276), to the proposed specification (API) differences, and to the proposed code changes.

This is another case where a proposed change leads to some interesting discussion. The thread started with this message includes discussion regarding whether or not to include operating-system specific line separators Files.readString in methods implementations, discussion of alternatives of the readString methods such as simple Files.lines().collect(Collectors.joining("\n")), explanation of how raw string literals handle line separators, a described example of a common use case for these methods, and use of File.deleteOnExit() with unit tests.

JDK-8201276 shows that the proposed methods to implement "common operations" for "reading the content of a file into a string and writing a string text to a file" are currently planned for JDK 11.

New JDK 11 Files Method isSameContent()

It has been proposed that a method named isSameContents() be added to the Files class in JDK 11 via JDK-8202285 ["(fs) Add a method to Files for comparing file contents"]. Proposed by Joe Wang, this new method is "intended to be an extension to the existing isSameFile method since it stopped short of comparing the content to answer the query for whether two files are equal." JDK-8201276 also references this method and describes it as "a utility method that compares two files."

Regarding the usage of this new method, JDK-8202285's Description states:

Proposing a new Files method isSameContent. Files currently has a method called isSameFile that answers the query on whether or not two files are the same file. Since two files containing the same contents may also be viewed as the same, it is desirable to add a method that further compares the contents, that would make the "is same file" query complete.

The OpenJDK core-libs-dev mailing list discussion on this thread provides additional details on the background of, motivation for, and implementation of this new method. For example, there are messages on this thread that do the following:

A particularly insightful message in this thread is a Rémi Forax message providing code demonstrating how to use the JDK 9-added InputStream.transfer(OutputStream) method, the JDK 10-added local variable type inference, and classes MessageDigest and DigestOutputStream to hash the contents of a file in six lines of Java code.

It's looking increasingly likely that JDK 11 will provide several new useful "utility" methods in addition to the JEPs and other more significant features that will be coming with JDK 11.

Saturday, May 19, 2018

Predicate::not Coming to Java

Jim Laskey's recent message "RFR: CSR - JDK-8203428 Predicate::not" on the OpenJDK core-libs-dev mailing list calls out JDK Bug JDK-8203428 ["Predicate::not"]. The "Summary" of JDK-8203428 states, "Introduce a new static method Predicate::not which will allow developers to negate predicate lambdas trivially." It is currently assigned to JDK 11.

The "Problem" section of JDK-8203428 provides a succinct description of the issue that Predicate::not addresses:

The requirement for predicate negation occurs frequently since predicates are defined antipodal to a positive selection; isNull, isEmpty, isBlank.

Presently there is no easy way to negate a predicate lambda without first wrapping in a Predicate Object.

There is a highly illustrative example of how this would work in the JDK-8203428 write-up. The "Problem" section of JDK-8203428 provides code that demonstrates how "predicate negation" would be performed today and the "Solution" section provides code demonstrating how the same functionality could be implemented with the proposed static method Predicate::not.

There are some other interesting messages in this mailing list thread. A Brian Goetz message in the thread states that "we did discover that default methods on [functional interfaces] combined with subtyping of [functional interfaces] caused trouble. But static methods are fine." A Rémi Forax message in the thread states that "stackoverflow has already decided that Predicate.not was the right method." A Sundararajan Athijegannathan message in the thread points out that "not(String::isEmpty) reads almost like !str.isEmpty()".

A message on a different thread on the core-libs-dev mailing list adds this additional context regarding the motivation behind the addition of Predicate.not(): "One of the rationals for adding 'Predicate.not' is to avoid adding another stream operation" because "we are trying to avoid adding 'overloaded' forms of the same kind of operation." This Paul Sandoz message provides the example that one would use the general filter(not(String::isEmpty)) rather than a specific Filter.reject or Filter.notFilter operation.

The addition of static function not(Predicate<T>) to Predicate is a small thing, but should improve the fluency of many lines of Java code.

Monday, May 14, 2018

Three New JEPs Targeted for JDK 11

Three new JEPs were targeted for JDK 11 a week ago today (7 May 2014). In three separate messages on the jdk-dev mailing list, Mark Reinhold made the following announcements:

JEP 324: Key Agreement with Curve25519 and Curve448

The "Summary" section of JEP 324 ("Key Agreement with Curve25519 and Curve448") states, "Implement key agreement using Curve25519 and Curve448 as described in RFC 7748." The Curve25519 entry on Wikipedia has an opening paragraph that makes it clear why this particular elliptic curve is well-suited as an addition to the JDK. It states that "Curve25519 is an elliptic curve offering 128 bits of security" that is "designed for use with the elliptic curve Diffie–Hellman (ECDH) key agreement scheme and is one of the fastest ECC curves and is not covered by any known patents." It adds that "the reference implementation is public domain software."

D. J. Bernstein provides a more specific and approachable summary of the value of Curve25519: "Given a user's 32-byte secret key, Curve25519 computes the user's 32-byte public key. Given the user's 32-byte secret key and another user's 32-byte public key, Curve25519 computes a 32-byte secret shared by the two users. This secret can then be used to authenticate and encrypt messages between the two users."

RFC 7748 ("Elliptic Curves for Security") is a memo provided by the Internet Research Task Force (IRTF) that "specifies two elliptic curves [curve25519 and curve448] over prime fields that offer a high level of practical security in cryptographic applications, including Transport Layer Security (TLS), and that are "intended to operate at the ~128-bit and ~224-bit security level, respectively, and are generated deterministically based on a list of required properties."

The "primary goal" of JEP 324 is to provided "an API and an implementation for [the RFC 7748] standard," the two additional goals are also spelled out. One of the additional goals is to provide "a platform-independent, all-Java implementation with better performance than the existing ECC (native C) code at the same security strength."

Current plans for JEP 324 only involve an RFC implementation in the SunEC elliptic curve cryptography provider. However, with an API provided ("XDH"), it seems possible for other ECC providers to implement RFC 7748 as desired. JEP 324 currently adds this important note: "This new library will be in an internal JDK package, and will only be used by new crypto algorithms."

JEP 327: Unicode 10

JEP 327 ("Unicode 10") provides a straightforward "Summary": "Upgrade existing platform APIs to support version 10.0 of the Unicode Standard." The WikiBooks entry "Unicode/Versions" lists each major version of Unicode from Unicode 1.0 through Unicode 12.0. Unicode 10 was released last summary and Unicode 11 is planned for next month.

An early history/mapping of Java versions to Unicode versions is provided in "Unicode Versions Supported in Java-History," which shows Unicode 1.1.5 associated with JDK 1.0 through Unicode 6.0 associated with JDK 7. Unicode 6.2.0 was supported by JDK 8, Unicode 8.0.0 was supported by JDK 9 and JDK 10, so JDK 11 will add support for both Unicode 9.0 and Unicode 10.0.

JEP 327 explicitly lists "four related Unicode specifications" (Unicode Technical Standards) that "will not be implemented" as part of JEP 327. These are UTS #10 ("Unicode Collation Algorithm"), UTS #39 ("Unicode Security Mechanisms"), UTS #46 ("Unicode IDNA Compatibility Processing"), and UTS #51 ("Unicode Emoji").

JEP 328: Flight Recorder

JEP 328 ("Flight Recorder") aims to "provide a low-overhead data collection framework for troubleshooting Java applications and the HotSpot JVM." In its "Motivation" section, this JEP states, "Flight Recorder records events originating from applications, the JVM and the OS. Events are stored in a single file that can be attached to bug reports and examined by support engineers, allowing after-the-fact analysis of issues in the period leading up to a problem. Tools can use an API to extract information from recording files."

The current Java Mission Control page describes Java Flight Recorder and its relationship to Java Mission Control:

Java Flight Recorder and Java Mission Control together create a complete tool chain to continuously collect low level and detailed runtime information enabling after-the-fact incident analysis. Java Flight Recorder is a profiling and event collection framework built into the Oracle JDK. It allows Java administrators and developers to gather detailed low level information about how the Java Virtual Machine (JVM) and the Java application are behaving. Java Mission Control is an advanced set of tools that enables efficient and detailed analysis of the extensive of data collected by Java Flight Recorder. The tool chain enables developers and administrators to collect and analyze data from Java applications running locally or deployed in production environments.

This JEP is another step in achieving the announcement made last September that "Oracle will also open source commercial features such as Java Flight Recorder previously only available in the Oracle JDK." Marcus Hirt announced just this month that Java Mission Control has been open sourced with repositories available at http://hg.openjdk.java.net/jmc.

Summary

The three JEPs highlighted in this post now bring the number of JEPs currently associated with JDK 11 to a total of eight:

Saturday, May 12, 2018

Java's @Serial Annotation

The JDK may be getting another standard (predefined) annotation with JDK 11: @Serial. JDK-8202385 ["Annotation to mark serial-related fields and methods"] aims to add "some kind of 'SerialRelated' annotation to facilitate automated checking of the declarations of serial fields and methods." The idea is to better indicate to a developer when a serialization-related field or method is misspelled similar to the way that "the java.lang.Override annotation type is used to signal the compiler should verify the method is in fact overridden."

Joe Darcy recently requested review of the "webrev" (proposed code addition). This provides a peek at what the new @Serial might look like. The current proposal is for this annotation definition to reside in the java.io package, to be targeted at specific methods or fields, and to have SOURCE retention.

The Javadoc comments for the proposed definition of @Serial currently provide significant detail on how to use this annotation. This Javadoc also explicitly specifies which methods and fields are anticipated to be annotated with @Serial: writeObject(), readObject(), readObjectNoData(), writeReplace(), readResolve(), ObjectStreamField[], and serialVersionUID.

The proposed @Serial annotation will be checked when the javac "serial" lint check is executed. This is described in Darcy's e-mail request for review:

The proposed java.io.Serial annotation type is intended to be used along with an augmented implementation of javac's "serial" lint check; that work will be done separately as part of JDK-8202056: "Expand serial warning to check for bad overloads of serial-related methods".

It's interesting to note that the name of this annotation is not necessarily finalized, though it seems likely to stick. Darcy's e-mail message points out that alternate names such as @Serialize and @SerialRelated could also be used.

An interesting distinction is that the @Serial annotation cannot or should not be used with certain methods and certain fields of the Externalizable interface (extends Serializable) because those methods and fields are not used in Externalizable. More details on this distinction are available in the core-libs-dev messages 053060, 053061, 053064, and 053067.

The @Serial annotation is not officially scheduled for JDK 11 as of this writing, but it appears likely that it could be available in time for the JDK 11 release given the recent progress of JDK-8202385. Besides the potential usefulness of this annotation to those implementing custom serialization, this annotation's definition will provide another example of how any custom annotation can be documented to allow it to be used correctly.

Tuesday, May 1, 2018

New Methods on Java String with JDK 11

It appears likely that Java's String class will be gaining some new methods with JDK 11, expected to be released in September 2018.

BUG #BUG TITLENEW String METHODDESCRIPTION
JDK-8200425 String::lines lines() "String instance method that uses a specialized Spliterator to lazily provide lines from the source string."
JDK-8200378 String::strip, String::stripLeading, String::stripTrailing strip() "Unicode-aware" evolution of trim()
stripLeading() "removal of Unicode white space from the beginning"
stripTrailing() "removal of Unicode white space from the ... end"
JDK-8200437 String::isBlank isBlank() "instance method that returns true if the string is empty or contains only white space"

Evidence of the progress that has been made related to these methods can be found in messages requesting "compatibility and specification reviews" (CSR) on the core-libs-dev mailing list:

A common characteristic of four of these five new methods is that they use a different (newer) definition of "whitespace" than did old methods such as String.trim(). Bug JDK-8200373 ["String::trim JavaDoc should clarify meaning of space"] even addresses this for the String.trim() method (mailing list review request):

The current JavaDoc for String::trim does not make it clear which definition of "space" is being used in the code. With additional trimming methods coming in the near future that use a different definition of space, clarification is imperative. String::trim uses the definition of space as any codepoint that is less than or equal to the space character codepoint (\u0040.) Newer trimming methods will use the definition of (white) space as any codepoint that returns true when passed to the Character::isWhitespace predicate.

The method isWhitespace(char) was added to Character with JDK 1.1, but the method isWhitespace(int) was not introduced to the Character class until JDK 1.5. The latter method (the one accepting a parameter of type int) was added to support supplementary characters. The Javadoc comments for the Character class define supplementary characters (typically modeled with int-based "code point") versus BMP characters (typically modeled with single character):

The set of characters from U+0000 to U+FFFF is sometimes referred to as the Basic Multilingual Plane (BMP). Characters whose code points are greater than U+FFFF are called supplementary characters. The Java platform uses the UTF-16 representation in char arrays and in the String and StringBuffer classes. In this representation, supplementary characters are represented as a pair of char values ... A char value, therefore, represents Basic Multilingual Plane (BMP) code points, including the surrogate code points, or code units of the UTF-16 encoding. An int value represents all Unicode code points, including supplementary code points. ... The methods that only accept a char value cannot support supplementary characters. ... The methods that accept an int value support all Unicode characters, including supplementary characters.

I added the bold emphasis in the above quote to emphasize the significance of a "code point," which is defined for the Java context as "a value that can be used in a coded character set". Four of the five proposed new methods for String in JDK 11 rely heavily on the concept embodied in Character.isWhitespace(int) to determine how to "trim" a given string or when determining if a given string is "blank."

Speaking of Unicode, JEP 327 ["Unicode 10"] has been proposed to be added to JDK 11 as well. As that JEP states, its intent is to "upgrade existing platform APIs to support version 10.0 of the Unicode Standard."

Conclusion

The new methods on String currently proposed for JDK 11 provide a more consistent approach to handling white space in strings that can better handle internationalization, provide methods for trimming whitespace only at the beginning of the string or at the end of the string, and provide a method especially intended for coming raw string literals.

Additional References