Did you know ... Search Documentation:
protobufs.pl -- Google's Protocol Buffers ("protobufs")
PublicShow source

Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data -- think XML, but smaller, faster, and simpler. You define how you want your data to be structured once. This takes the form of a template that describes the data structure. You use this template to encode and decode your data structure into wire-streams that may be sent-to or read-from your peers. The underlying wire stream is platform independent, lossless, and may be used to interwork with a variety of languages and systems regardless of word size or endianness. Techniques exist to safely extend your data structure without breaking deployed programs that are compiled against the "old" format.

The idea behind Google's Protocol Buffers is that you define your structured messages using a domain-specific language and tool set. Further documentation on this is at https://developers.google.com/protocol-buffers.

There are two ways you can use protobufs in Prolog:

The protobuf_parse_from_codes/3 and protobuf_serialize_to_codes/3 interface translates between a "wire stream" and a Prolog term. This interface takes advantage of SWI-Prolog's dict. There is a protoc plugin (protoc-gen-swipl) that generates a Prolog file of meta-information that captures the .proto file's definition in the protobufs module:

  • proto_meta_normalize(Unnormalized, Normalized)
  • proto_meta_package(Package, FileName, Options)
  • proto_meta_message_type( Fqn, Package, Name)
  • proto_meta_message_type_map_entry( Fqn)
  • proto_meta_field_name( Fqn, FieldNumber, FieldName, FqnName)
  • proto_meta_field_json_name( FqnName, JsonName)
  • proto_meta_field_label( FqnName, LabelRepeatOptional) % 'LABEL_OPTIONAL', 'LABEL_REQUIRED', 'LABEL_REPEATED'
  • proto_meta_field_type( FqnName, Type) % 'TYPE_INT32', 'TYPE_MESSAGE', etc
  • proto_meta_field_type_name( FqnName, TypeName)
  • proto_meta_field_default_value( FqnName, DefaultValue)
  • proto_meta_field_option_packed( FqnName)
  • proto_meta_enum_type( FqnName, Fqn, Name)
  • proto_meta_enum_value( FqnName, Name, Number)
  • proto_meta_field_oneof_index( FqnName, Index)
  • proto_meta_oneof( FqnName, Index, Name)

The protobuf_message/2 interface allows you to define your message template as a list of predefined Prolog terms that correspond to production rules in the Definite Clause Grammar (DCG) that realizes the interpreter. Each production rule has an equivalent rule in the protobuf grammar. The process is not unlike specifiying the format of a regular expression. To encode a template to a wire-stream, you pass a grounded template, X, and variable, Y, to protobuf_message/2. To decode a wire-stream, Y, you pass an ungrounded template, X, along with a grounded wire-stream, Y, to protobuf_message/2. The interpreter will unify the unbound variables in the template with values decoded from the wire-stream.

For an overview and tutorial with examples, see library(protobufs): Google's Protocol Buffers Examples of usage may also be found by inspecting test_protobufs.pl and the demo directory, or by looking at the "addressbook" example that is typically installed at /usr/lib/swi-prolog/doc/packages/examples/protobufs/interop/addressbook.pl

author
- Jeffrey Rosenwald (JeffRose@acm.org)
- Peter Ludemann (peter.ludemann@gmail.org)
See also
- https://developers.google.com/protocol-buffers
- https://developers.google.com/protocol-buffers/docs/encoding
Compatibility
- SWI-Prolog
Source protobuf_parse_from_codes(+WireCodes:list(int), +MessageType:atom, -Term) is semidet
Process bytes (list of int) that is the serialized form of a message (designated by MessageType), creating a Prolog term.

Protoc must have been run (with the --swipl_out= option and the resulting top-level _pb.pl file loaded. For more details, see the "protoc" section of the overview documentation.

Fails if the message can't be parsed or if the appropriate meta-data from protoc hasn't been loaded.

All fields that are omitted from the WireCodes are set to their default values (typically the empty string or 0, depending on the type; or [] for repeated groups). There is no way of testing whether a value was specified in WireCodes or given its default value (that is, there is no equivalent of the Python implementation's =HasField`). Optional embedded messages and groups do not have any default value -- you must check their existence by using get_dict/3 or similar. If a field is part of a "oneof" set, then none of the other fields is set. You can determine which field had a value by using get_dict/3.

Arguments:
WireCodes- Wire format of the message from e.g., read_stream_to_codes/2. (The stream should have options encoding(octet) and type(binary), either as options to read_file_to_codes/3 or by calling set_stream/2 on the stream to read_stream_to_codes/2.)
MessageType- Fully qualified message name (from the .proto file's package and message). For example, if the package is google.protobuf and the message is FileDescriptorSet, then you would use '.google.protobuf.FileDescriptorSet' or 'google.protobuf.FileDescriptorSet'. If there's no package name, use e.g.: 'MyMessage or '.MyMessage'. You can see the packages by looking at protobufs:proto_meta_package(Pkg,File,_) and the message names and fields by protobufs:proto_meta_field_name('.google.protobuf.FileDescriptorSet', FieldNumber, FieldName, FqnName) (the initial '.' is not optional for these facts, only for the top-level name given to protobuf_serialize_to_codes/3).
Term- The generated term, as nested dicts.
Errors
- version_error(Module-Version) you need to recompile the Module with a newer version of protoc.
See also
- library(protobufs): Google's Protocol Buffers
bug
- Ignores .proto extensions.
- map fields don't get special treatment (but see protobuf_map_pairs/3).
- Generates fields in a different order from the C++, Python, Java implementations, which use the field number to determine field order whereas currently this implementation uses field name. (This isn't stricly speaking a bug, because it's allowed by the specification; but it might cause some surprise.)
To be done
- document the generated terms (see library(http/json) and json_read_dict/3)
- add options such as true and value_string_as (similar to json_read_dict/3)
- add option for form of the dict tags (fully qualified or not)
- add option for outputting fields in the C++/Python/Java order (by field number rather than by field name).
Source protobuf_serialize_to_codes(+Term:dict, -MessageType:atom, -WireCodes:list(int)) is det
Process a Prolog term into bytes (list of int) that is the serialized form of a message (designated by MessageType).

Protoc must have been run (with the --swipl_out= option and the resulting top-level _pb.pl file loaded. For more details, see the "protoc" section of the overview documentation.

Fails if the term isn't of an appropriate form or if the appropriate meta-data from protoc hasn't been loaded, or if a field name is incorrect (and therefore nothing in the meta-data matches it).

Arguments:
Term- The Prolog form of the data, as nested dicts.
MessageType- Fully qualified message name (from the .proto file's package and message). For example, if the package is google.protobuf and the message is FileDescriptorSet, then you would use '.google.protobuf.FileDescriptorSet' or 'google.protobuf.FileDescriptorSet'. If there's no package name, use e.g.: 'MyMessage or '.MyMessage'. You can see the packages by looking at protobufs:proto_meta_package(Pkg,File,_) and the message names and fields by protobufs:proto_meta_field_name('.google.protobuf.FileDescriptorSet', FieldNumber, FieldName, FqnName) (the initial '.' is not optional for these facts, only for the top-level name given to protobuf_serialize_to_codes/3).
WireCodes- Wire format of the message, which can be output using format('~s', [WireCodes]).
Errors
- version_error(Module-Version) you need to recompile the Module with a newer version of protoc.
- existence_error if a field can't be found in the meta-data
See also
- library(protobufs): Google's Protocol Buffers
bug
- map fields don't get special treatment (but see protobuf_map_pairs/3).
- oneof is not checked for validity.
Source protobuf_var_int(?A:int)// is det[private]
Conversion between an int A and a list of codes, using the "varint" encoding. The behvior is undefined if A is negative. This is a low-level predicate; normally, you should use template_message/2 and the appropriate template term. e.g. phrase(protobuf_var_int(300), S) => S = [172,2] phrase(protobuf_var_int(A), [172,2]) -> A = 300
Source protobuf_tag_type(?Tag:int, ?WireType:atom)// is det[private]
Conversion between Tag (number) + WireType and wirestream codes. This is a low-level predicate; normally, you should use template_message/2 and the appropriate template term. @arg Tag The item's tag (field number) @arg WireType The item's wire type (see prolog_type//2 for how to convert this to a Prolog type)
Source prolog_type(?Tag:int, ?PrologType:atom)// is semidet[private]
Match Tag (field number) + PrologType. When Type is a variable, backtracks through all the possibilities for a given wire encoding. Note that 'repeated' isn't here because it's handled by single_message//3. See also segment_type_tag/3.
 payload(?PrologType, ?Payload) is det[private]
Process the codes into Payload, according to PrologType TODO: payload//2 "mode" is sometimes module-sensitive, sometimes not. payload(enum, A)// has A as a callable all other uses of payload//2, the 2nd arg is not callable.
  • This confuses check/0; it also makes defining an enumeration more difficult because it has to be defined in module protobufs (see vector_demo.pl, which defines commands/2)
Source single_message(+PrologType:atom, ?Tag, ?Payload)// is det[private]
Processes a single messages (e.g., one item in the list in protobuf([...]). The PrologType, Tag, Payload are from Field =.. [PrologType, Tag, Payload] in the caller
Source protobuf_message(?Template, ?WireStream) is semidet
Source protobuf_message(?Template, ?WireStream, ?Rest) is nondet
Marshals and unmarshals byte streams encoded using Google's Protobuf grammars. protobuf_message/2 provides a bi-directional parser that marshals a Prolog structure to WireStream, according to rules specified by Template. It can also unmarshal WireStream into a Prolog structure according to the same grammar. protobuf_message/3 provides a difference list version.
Arguments:
Template- is a protobuf grammar specification. On decode, unbound variables in the Template are unified with their respective values in the WireStream. On encode, Template must be ground.
WireStream- is a code list that was generated by a protobuf encoder using an equivalent template.
bug
- The protobuf specification states that the wire-stream can have the fields in any order and that unknown fields are to be ignored. This implementation assumes that the fields are in the exact order of the definition and match exactly. If you use protobuf_parse_from_codes/3, you can avoid this problem.o
Source protobuf_segment_message(+Segments:list, -WireStream:list(int)) is det[private]
protobuf_segment_message(-Segments:list, +WireStream:list(int)) is det[private]
Low level marshalling and unmarshalling of byte streams. The processing is independent of the .proto description, similar to the processing done by protoc --decode_raw. This means that field names aren't shown, only field numbers.

For unmarshalling, a simple heuristic is used on length-delimited segments: first interpret it as a message; if that fails, try to interpret as a UTF8 string; otherwise, leave it as a "blob" (if the heuristic was wrong, you can convert to a string or a blob by using protobuf_segment_convert/2). 32-bit and 64-bit numbers are left as codes because they could be either integers or floating point (use int32_codes_when/2, float32_codes_when/2, int64_codes_when/2, uint32_codes_when/2, uint64_codes_when/2, float64_codes_when/2 as appropriate); variable-length numbers ("varint" in the Protocol Buffers encoding documentation), might require "zigzag" conversion, int64_zigzag_when/2.

For marshalling, use the predicates int32_codes_when/2, float32_codes_when/2, int64_codes_when/2, uint32_codes_when/2, uint64_codes_when/2, float64_codes_when/2, int64_zigzag_when/2 to put integer and floating point values into the appropriate form.

Arguments:
Segments- a list containing terms of the following form (Tag is the field number; Codes is a list of integers):
  • varint(Tag,Varint) - Varint may need int64_zigzag_when/2
  • fixed64(Tag,Int) - Int signed, derived from the 8 codes
  • fixed32(Tag,Codes) - Int is signed, derived from the 4 codes
  • message(Tag,Segments)
  • group(Tag,Segments)
  • string(Tag,String) - String is a SWI-Prolog string
  • packed(Tag,Type(Scalars)) - Type is one of varint, fixed64, fixed32; Scalars is a list of Varint or Codes, which should be interpreted as described under those items. Note that the protobuf specification does not allow packed repeated string.
  • length_delimited(Tag,Codes)
  • repeated(List) - List of segments

Of these, group is deprecated in the protobuf documentation and shouldn't appear in modern code, having been superseded by nested message types.

For deciding how to interpret a length-delimited item (when Segments is a variable), an attempt is made to parse the item in the following order (although code should not rely on this order):

  • message
  • string (it must be in the form of a UTF string)
  • packed (which can backtrack through the various Types)
  • length_delimited - which always is possible.

The interpretation of length-delimited items can sometimes guess wrong; the interpretation can be undone by either backtracking or by using protobuf_segment_convert/2 to convert the incorrect segment to a string or a list of codes. Backtracking through all the possibilities is not recommended, because of combinatoric explosion (there is an example in the unit tests); instead, it is suggested that you take the first result and iterate through its items, calling protobuf_segment_convert/2 as needed to reinterpret incorrectly guessed segments.

WireStream- a code list that was generated by a protobuf endoder.
See also
- https://developers.google.com/protocol-buffers/docs/encoding
bug
- This predicate is preliminary and may change as additional functionality is added.
Source detag(+Compound, -Name, -Tag, -Value, List, -CompoundWithList) is semidet[private]
Deconstruct Compound or the form Name(Tag,Value) and create a new CompoundWithList that replaces Value with List. This is used by packed_list/2 to transform [varint(1,0),varint(1,1)] to varint(1,[0,1]).

Some of Compound items are impossible for packed with the current protobuf spec, but they don't do any harm.

Source protobuf_segment_convert(+Form1, ?Form2) is multi[private]
A convenience predicate for dealing with the situation where protobuf_segment_message/2 interprets a segment of the wire stream as a form that you don't want (e.g., as a message but it should have been a UTF8 string).

Form1 is converted back to the original wire stream, then the predicate non-deterimisticly attempts to convert the wire stream to a string or length_delimited term (or both: the lattter always succeeds).

The possible conversions are: message(Tag,Segments) => string(Tag,String) message(Tag,Segments) => length_delimited(Tag,Codes) string(Tag,String) => length_delimited(Tag,Codes) length_delimited(Tag,Codes) => length_delimited(Tag,Codes)

Note that for fixed32, fixed64, only the signed integer forms are given; if you want the floating point forms, then you need to do use int64_float64_when/2 and int32_float32_when/2.

For example: ~~~{.pl} ?- protobuf_segment_convert( message(10,[fixed64(13,7309475598860382318)]), string(10,"inputType")). ?- protobuf_segment_convert( message(10,[fixed64(13,7309475598860382318)]), length_delimited(10,[105,110,112,117,116,84,121,112,101])). ?- protobuf_segment_convert( string(10, "inputType"), length_delimited(10,[105,110,112,117,116,84,121,112,101])). ?- forall(protobuf_segment_convert(string(1999,"\x1\\x0\\x0\\x0\\x2\\x0\\x0\\x0\"),Z), writeln(Z)). string(1999,

Source uint32_codes_when(?Uint32, ?Codes) is det[private]
Convert between a 32-bit unsigned integer value and its wirestream codes. This is a low-level predicate; normally, you should use template_message/2 and the appropriate template term.

This predicate delays until either Uint32 or Codes is sufficiently instantiated.

There is also a non-delayed uint32_codes/2

SWI-Prolog doesn't have a 32-bit integer type, so 32-bit integer is simulated by doing a range check.

Arguments:
Uint32- an unsigned integer that's in the 32-bit range
Codes- a list of 4 integers (codes)
Errors
- Type,Domain if Value or Codes are of the wrong type or out of range.
Source int32_codes_when(?Int32, ?Codes) is det[private]
Convert between a 32-bit signed integer value and its wirestream codes. This is a low-level predicate; normally, you should use template_message/2 and the appropriate template term.

This predicate delays until either Int32 or Codes is sufficiently instantiated.

There is also a non-delayed int32_codes/2

SWI-Prolog doesn't have a 32-bit integer type, so 32-bit integer is simulated by doing a range check.

Arguments:
Int32- an unsigned integer that's in the 32-bit range
Codes- a list of 4 integers (codes)
Errors
- Type,Domain if Value or Codes are of the wrong type or out of range.
Source float32_codes_when(?Value, ?Codes) is det[private]
Convert between a 32-bit floating point value and its wirestream codes. This is a low-level predicate; normally, you should use template_message/2 and the appropriate template term.

This predicate delays until either Value or Codes is sufficiently instantiated.

There is also a non-delayed float32_codes/2

Arguments:
Value- a floating point number
Codes- a list of 4 integers (codes)
Source uint64_codes_when(?Uint64, ?Codes) is det[private]
Convert between a 64-bit unsigned integer value and its wirestream codes. This is a low-level predicate; normally, you should use template_message/2 and the appropriate template term.

SWI-Prolog allows integer values greater than 64 bits, so a range check is done.

This predicate delays until either Uint64 or Codes is sufficiently instantiated.

There is also a non-delayed uint64_codes/2

Source int64_codes_when(?Int64, ?Codes) is det[private]
Convert between a 64-bit signed integer value and its wirestream codes. This is a low-level predicate; normally, you should use template_message/2 and the appropriate template term.

SWI-Prolog allows integer values greater than 64 bits, so a range check is done.

This predicate delays until either Int64 or Codes is sufficiently instantiated.

There is also a non-delayed int64_codes/2

Source float64_codes_when(?Value, ?Codes) is det[private]
Convert between a 64-bit floating point value and its wirestream codes. This is a low-level predicate; normally, you should use template_message/2 and the appropriate template term.

This predicate delays until either Value or Codes is sufficiently instantiated.

There is also a non-delayed float64_codes/2

Arguments:
Value- a floating point number
Codes- a list of 8 integers (codes)
Errors
- instantiation error if both Value and Codes are uninstantiated.
bug
- May give misleading exception under some circumstances (e.g., float64_codes(_, [_,_,_,_,_,_,_,_]).
Source int64_zigzag_when(?Original, ?Encoded) is det[private]
Convert between a signed integer value and its zigzag encoding, used for the protobuf sint32 and sint64 types. This is a low-level predicate; normally, you should use template_message/2 and the appropriate template term.

SWI-Prolog allows integer values greater than 64 bits, so a range check is done.

This predicate delays until either Original or Encoded is sufficiently instantiated.

There is also a non-delayed int64_zigzag/2

Arguments:
Original- an integer in the original form
Encoded- the zigzag encoding of Original
Errors
- Type,Domain if Original or Encoded are of the wrong type or out of range.
- instantiation error if both Original and Encoded are uninstantiated.
See also
- https://developers.google.com/protocol-buffers/docs/encoding#types
Source uint64_int64_when(?Uint64:integer, ?Int64:integer) is det[private]
Reinterpret-cast between uint64 and int64. For example, uint64_int64(0xffffffffffffffff,-1).

This predicate delays until either Uint64 or Int64 is sufficiently instantiated.

There is also a non-delayed uint64_int64/2

Arguments:
Uint64- 64-bit unsigned integer
Int64- 64-bit signed integer
Errors
- Type,Domain if Value or Codes are of the wrong type or out of range.
- instantiation error if both Value and Codes are uninstantiated.
Source uint32_int32_when(?Uint32, ?Int32) is det[private]
Reinterpret-case between uint32 and int32.

This predicate delays until either Uint32 or Int32 is sufficiently instantiated.

There is also a non-delayed uint32_int32/2

Arguments:
Uint32- 32-bit unsigned integer (range between 0 and 4294967295).
Int32- 32-bit signed integer (range between -2147483648 and 2147483647).
Errors
- Type,Domain if Int32 or Uint32 are of the wrong type or out of range.
- instantiation error if both UInt32 and Int32 are uninstantiated.
Source int64_float64_when(?Int64:integer, ?Float64:float) is det[private]
Reinterpret-cast between uint64 and float64. For example, int64_float64(3ff0000000000000,1.0).

This predicate delays until either Int64 or Float64 is sufficiently instantiated.

There is also a non-delayed uint64_int64/2

Arguments:
Int64- 64-bit unsigned integer
Float64- 64-bit float
Errors
- Type,Domain if Value or Codes are of the wrong type or out of range.
- instantiation error if both Value and Codes are uninstantiated.
Source int32_float32_when(?Int32:integer, ?Float32:float) is det[private]
Reinterpret-cast between uint32 and float32. For example, int32_float32(0x3f800000,1.0).

This predicate delays until either Int32 or Float32 is sufficiently instantiated.

There is also a non-delayed uint32_int32/2

Arguments:
Int32- 32-bit unsigned integer
Float32- 32-bit float
Errors
- Type,Domain if Value or Codes are of the wrong type or out of range.
- instantiation error if both Value and Codes are uninstantiated.
Source segment_to_term(+ContextType:atom, +Segment, -FieldAndValue) is det[private]
ContextType is the type (name) of the containing message Segment is a segment from protobuf_segment_message/2 TODO: if performance is an issue, this code can be combined with protobuf_segment_message/2 (and thereby avoid the use of protobuf_segment_convert/2)
Source convert_segment(+Type:atom, +ContextType:atom, Tag:atom, ?Segment, ?Value) is det[private]
Compute an appropriate Value from the combination of descriptor "type" (in Type) and a Segment. Reversible on Segment, Values.
Source add_defaulted_fields(+Value0:dict, ContextType:atom, -Value:dict) is det[private]
Source message_field_default(+ContextType:atom, Name:atom, -DefaultValue) is semidet[private]
Source combine_fields(+Fields:list, +MsgDict0, -MsgDict) is det[private]
Combines the fields into a dict and sets missing fields to their default values. If the field is marked as 'norepeat' (optional/required), then the last occurrence is kept (as per the protobuf wire spec) If the field is marked as 'repeat', then all the occurrences are put into a list, in order. This code assumes that fields normally occur all together, but can handle (less efficiently) fields not occurring together, as is allowed by the protobuf spec.
Source combine_fields_repeat(+Fields:list, Field:atom, -Values:list, RestFields:list) is det[private]
Helper for combine_fields/3 Stops at the first item that doesn't match Field - the assumption is that all the items for a field will be together and if they're not, they would be combined outside this predicate.
Arguments:
Fields- a list of fields (Field-Repeat-Value)
Field- the name of the field that is being combined
Values- gets the Value items that match Field
RestFields- gets any left-over fields
Source field_and_type(+ContextType:atom, +Tag:int, -FieldName:atom, -FqnName:atom, -ContextType2:atom, -RepeatOptional:atom, -Type:atom) is det[private]
Lookup a ContextType and Tag to get the field name, type, etc.
Source fqn_repeat_optional(+FqnName:atom, -RepeatOptional:atom) is det[private]
Lookup up proto_meta_field_label(FqnName, _), proto_meta_field_option_packed(FqnName) and set RepeatOptional to one of norepeat, repeat, repeat_packed.
Source fqn_repeat_optional_2(+DescriptorLabelEnum:atom, -RepeatOrEmpty:atom) is det[private]
Map the descriptor "label" to 'repeat' or 'norepeat'. From proto_meta_enum_value('.google.protobuf.FieldDescriptorProto.Label', Label, _).
Source field_descriptor_label_repeated(+Label:atom) is semidet[private]
From proto_meta_enum_value('.google.protobuf.FieldDescriptorProto.Label', 'LABEL_REPEATED', _). TODO: unused
Source field_descriptor_label_single(+Label:atom) is semidet[private]
From proto_meta_enum_value('.google.protobuf.FieldDescriptorProto.Label', Label, _).
Source term_to_segments(+Term:dict, +MessageType:atom, Segments) is det[private]
Recursively traverse a Term, generating message segments
Source field_segment_scalar_or_repeated(+Label, +Packed, +FieldType, +Tag, +FieldTypeName, ?Value, Segment) is det[private]
FieldType is from the .proto meta information ('TYPE_SINT32', etc.)
Source protobuf_field_is_map(+MessageType, +FieldName) is semidet
Succeeds if MessageType's FieldName is defined as a map<...> in the .proto file.
Source protobuf_map_pairs(+ProtobufTermList:list, ?DictTag:atom, ?Pairs) is det
Convert between a list of protobuf map entries (in the form DictTag{key:Key, value:Value} and a key-value list as described in library(pairs). At least one of ProtobufTermList and Pairs must be instantiated; DictTag can be uninstantiated. If ProtobufTermList is from a term created by protobuf_parse_from_codes/3, the ordering of the items is undefined; you can order them by using keysort/2 (or by a predicate such as dict_pairs/3, list_to_assoc/2, or list_to_rbtree/2.

Undocumented predicates

The following predicates are exported, but not or incorrectly documented.

Source protobuf_message(Arg1, Arg2, Arg3)