Protocol Buffers v3.0.0-alpha-2
Pre-release
Version 3.0.0-alpha-2 (C++/Java/Python/Ruby/JavaNano)
General
- Introduced Protocol Buffers language version 3 (aka proto3).

When protobuf was initially open-sourced, it implemented Protocol Buffers
language version 2 (aka proto2), which is why the version number
started from v2.0.0. From v3.0.0, a new language version (proto3) is
introduced while the old version (proto2) will continue to be supported.

The main intent of introducing proto3 is to clean up protobuf before
pushing the language as the foundation of Google's new API platform.
In proto3, the language is simplified, both for ease of use and to
make it available in a wider range of programming languages. At the
same time a few features are added to better support common idioms
found in APIs.

The following are the main new features in language version 3:
  1. Removal of field presence logic for primitive value fields, removal
     of required fields, and removal of default values. This makes proto3
     significantly easier to implement with open struct representations,
     as in languages like Android Java, Objective C, or Go.
  2. Removal of unknown fields.
  3. Removal of extensions, which are instead replaced by a new standard
     type called Any.
  4. Fix semantics for unknown enum values.
  5. Addition of maps.
  6. Addition of a small set of standard types for representation of time,
     dynamic data, etc.
  7. A well-defined encoding in JSON as an alternative to binary proto
     encoding.

This release (v3.0.0-alpha-2) includes partial proto3 support for C++,
Java, Python, Ruby and JavaNano. Items 6 (well-known types) and 7
(JSON format) in the above feature list are not implemented.

A new notion "syntax" is introduced to specify whether a .proto file
uses proto2 or proto3:

    // foo.proto
    syntax = "proto3";
    message Bar {...}

If omitted, the protocol compiler will generate a warning and "proto2" will
be used as the default. This warning will be turned into an error in a
future release.

We recommend that new Protocol Buffers users use proto3. However, we do not
generally recommend that existing users migrate from proto2 to proto3 due
to API incompatibility, and we will continue to support proto2 for a long
time.

- Added support for map fields (implemented in proto2 and proto3 C++/Java/JavaNano and proto3 Ruby).

Map fields can be declared using the following syntax:

    message Foo {
      map<string, string> values = 1;
    }

Data of a map field will be stored in memory as an unordered map and it
can be accessed through generated accessors.
C++
- Added arena allocation support (for both proto2 and proto3).

Profiling shows memory allocation and deallocation constitutes a significant
fraction of CPU-time spent in protobuf code and arena allocation is a
technique introduced to reduce this cost. With arena allocation, new
objects will be allocated from a large piece of preallocated memory and
deallocation of these objects is almost free. Early adoption shows 20% to
50% improvement in some Google binaries.

To enable arena support, add the following option to your .proto file:

    option cc_enable_arenas = true;

The protocol compiler will generate additional code to make the generated
message classes work with arenas. This does not change the existing API
of protobuf messages and does not affect wire format. Your existing code
should continue to work after adding this option. In the future we will
make this option enabled by default.

To actually take advantage of arena allocation, you need to use the arena
APIs when creating messages. A quick example of using the arena API:

    {
      google::protobuf::Arena arena;
      // Allocate a protobuf message in the arena.
      MyMessage* message =
          google::protobuf::Arena::CreateMessage<MyMessage>(&arena);
      // All submessages will be allocated in the same arena.
      if (!message->ParseFromString(data)) {
        // Deal with malformed input data.
      }
      // Must not delete the message here. It will be deleted automatically
      // when the arena is destroyed.
    }

Currently, arena allocation does not work with map fields. Enabling arenas
in a .proto file containing map fields will result in compile errors in the
generated code. This will be addressed in a future release.
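
The cost argument made above for arenas (objects carved out of a large preallocated block, with deallocation that is almost free) can be illustrated with a toy bump allocator. This is only a sketch of the general technique, not how google::protobuf::Arena is implemented; ToyArena, Point and SumOnArena are hypothetical names, and the sketch only accepts trivially destructible types because it never runs destructors.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Toy bump allocator, for illustration only (hypothetical; not protobuf code).
class ToyArena {
 public:
  explicit ToyArena(std::size_t block_size = 4096) : block_size_(block_size) {}

  // Constructs a T inside the current block. No per-object bookkeeping is
  // kept, which is why individual objects are never freed.
  template <typename T, typename... Args>
  T* Create(Args&&... args) {
    static_assert(std::is_trivially_destructible<T>::value,
                  "ToyArena never runs destructors");
    void* slot = Allocate(sizeof(T), alignof(T));
    return new (slot) T{std::forward<Args>(args)...};
  }

  std::size_t block_count() const { return blocks_.size(); }

 private:
  void* Allocate(std::size_t size, std::size_t align) {
    assert(size <= block_size_);
    // Round the current offset up to the required alignment.
    std::size_t offset = (used_ + align - 1) & ~(align - 1);
    if (blocks_.empty() || offset + size > block_size_) {
      // Current block is full (or there is none yet): grab a fresh block.
      blocks_.push_back(
          std::unique_ptr<unsigned char[]>(new unsigned char[block_size_]));
      offset = 0;
    }
    used_ = offset + size;
    return blocks_.back().get() + offset;
  }

  std::size_t block_size_;
  std::size_t used_ = 0;  // Bytes consumed in the most recent block.
  std::vector<std::unique_ptr<unsigned char[]>> blocks_;
};

struct Point {
  int x;
  int y;
};

// Allocates `count` points from one arena and sums their x fields.
// There is no delete call: every block is released at once when `arena`
// goes out of scope.
long SumOnArena(int count) {
  ToyArena arena(/*block_size=*/64);
  std::vector<Point*> points;
  for (int i = 0; i < count; ++i) points.push_back(arena.Create<Point>(i, -i));
  long sum = 0;
  for (const Point* p : points) sum += p->x;
  return sum;
}
```

Each allocation here is an offset bump plus an occasional block request, and teardown is one pass over a handful of blocks rather than one free per object. That is the trade the section describes: cheaper allocation and deallocation in exchange for giving up per-object deletion.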
Python
- Python has received several updates, most notably support for proto3
semantics in any .proto file that declares syntax="proto3".
Messages declared in proto3 files no longer represent field presence
for scalar fields (numbers, enums, booleans, or strings). You can
no longer call HasField() for such fields, and they are serialized
based on whether they have a non-zero/empty/false value.
- One other notable change is in the C++-accelerated implementation.
Descriptor objects (which describe the protobuf schema and allow
reflection over it) are no longer duplicated between the Python
and C++ layers. The Python descriptors are now simple wrappers
around the C++ descriptors. This change should significantly
reduce the memory usage of programs that use a lot of message
types.
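
The serialization rule in the first item above (scalar fields are written only when they hold a non-zero/non-empty/true value) can be sketched with a toy encoder. The code below is an illustration in C++, not the Python or C++ protobuf runtime; ToyMessage, AppendVarint and SerializeProto3Style are hypothetical names, and the message is assumed to have just two fields, `uint32 id = 1` and `bool active = 2`, both encoded as varints.

```cpp
#include <cstdint>
#include <string>

// Appends `value` to `out` as a base-128 varint.
void AppendVarint(std::uint64_t value, std::string* out) {
  while (value >= 0x80) {
    out->push_back(static_cast<char>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out->push_back(static_cast<char>(value));
}

// Hypothetical message: uint32 id = 1; bool active = 2;
struct ToyMessage {
  std::uint32_t id = 0;
  bool active = false;
};

// Writes each scalar field only if it differs from zero/false. There is no
// separate "has" bit, so a field explicitly set to 0 is indistinguishable
// from one that was never set.
std::string SerializeProto3Style(const ToyMessage& m) {
  std::string out;
  if (m.id != 0) {
    AppendVarint((1 << 3) | 0, &out);  // Tag: field number 1, wire type 0.
    AppendVarint(m.id, &out);
  }
  if (m.active) {
    AppendVarint((2 << 3) | 0, &out);  // Tag: field number 2, wire type 0.
    AppendVarint(1, &out);
  }
  return out;
}
```

With this sketch, a default-constructed ToyMessage serializes to zero bytes, and so does one whose id was explicitly assigned 0. That is why a HasField()-style query cannot be answered for such fields.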
Ruby
- We have added proto3 support for Ruby via a native C extension.

The Ruby extension itself is included in the ruby/ directory, and details on
building and installing the extension are in ruby/README.md. The extension
will also be published as a Ruby gem. Code generator support is included as
part of protoc with the --ruby_out flag.

The Ruby extension implements a user-friendly DSL to define message types
(also generated by the code generator from .proto files). Once a message
type is defined, the user may create instances of the message that behave in
ways idiomatic to Ruby. For example:

  - Message fields are present as ordinary Ruby properties (getter method
    foo and setter method foo=).
  - Repeated field elements are stored in a container that acts like a native
    Ruby array, and map elements are stored in a container that acts like a
    native Ruby hashmap.
  - The usual well-known methods, such as #to_s, #dup, and the like, are
    present.

Unlike several existing third-party Ruby extensions for protobuf, this
extension is built on a "strongly-typed" philosophy: message fields and
array/map containers will throw exceptions eagerly when values of the
incorrect type are inserted.

See ruby/README.md for details.
JavaNano
- JavaNano is a special code generator and runtime library designed especially
for resource-restricted systems, like Android. It is very resource-friendly
in both the amount of code and the runtime overhead. Here is an overview
of JavaNano features compared with the official Java protobuf:

  - No descriptors or message builders.
  - All messages are mutable; fields are public Java fields.
  - For optional fields only, encapsulation behind setter/getter/hazzer/
    clearer functions is opt-in, which provides proper 'has' state support.
  - For proto2, if not opted in, has state (field presence) is not available.
    Serialization outputs all fields not equal to their defaults.
    The behavior is consistent with proto3 semantics.
  - Required fields (proto2 only) are always serialized.
  - Enum constants are integers; protection against invalid values only
    when parsing from the wire.
  - Enum constants can be generated into container interfaces bearing
    the enum's name (so the referencing code is in Java style).
  - CodedInputByteBufferNano can only take byte[] (not InputStream).
  - Similarly CodedOutputByteBufferNano can only write to byte[].
  - Repeated fields are in arrays, not ArrayList or Vector. Null array
    elements are allowed and silently ignored.
  - Full support for serializing/deserializing repeated packed fields.
  - Support extensions (in proto2).
  - Unset messages/groups are null, not an immutable empty default
    instance.
  - toByteArray(...) and mergeFrom(...) are now static functions of
    MessageNano.
  - The 'bytes' type translates to the Java type byte[].

See javanano/README.txt for details.