exml is an Erlang library for parsing XML streams and manipulating complex XML structures.
exml is a rebar3-compatible OTP application. To build it, run make or ./rebar3 compile. A C++11 compiler is required.
exml can parse XML streams as well as whole XML documents in a single call.
To parse a whole XML document:
{ok, Doc} = exml:parse(<<"<my_xml_doc/>">>).
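The call above assumes the input is well-formed. A minimal sketch of handling both outcomes follows, assuming exml:parse/1 returns either {ok, Element} or {error, Reason}, and that Element has the {xmlel, Name, Attrs, Children} shape used elsewhere in this README. The module and function names (parse_sketch, root_name/1) are hypothetical.

```erlang
-module(parse_sketch).
-export([root_name/1]).

%% Returns {ok, RootName} for a well-formed document, or passes the
%% parser error through unchanged.
%% Assumption: exml:parse/1 returns {ok, Element} | {error, Reason}.
root_name(Bin) when is_binary(Bin) ->
    case exml:parse(Bin) of
        {ok, {xmlel, Name, _Attrs, _Children}} ->
            {ok, Name};
        {error, Reason} ->
            {error, Reason}
    end.
```

Matching on the tuple shape rather than the record keeps the sketch free of an include of exml.hrl.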
To generate an XML document from Erlang terms:
El = #xmlel{name = <<"foo">>,
            attrs = [{<<"attr1">>, <<"bar">>}],
            children = [{xmlcdata, <<"Some Value">>}]},
exml:to_list(El).
or (pastable into the erl shell):
El = {xmlel, <<"foo">>,
      [{<<"attr1">>, <<"bar">>}],
      [{xmlcdata, <<"Some Value">>}]}.
exml:to_list(El).
Which results in:
<foo attr1='bar'>Some Value</foo>
exml:to_binary/1 works similarly.
There's also exml:to_pretty_iolist/1,3 for a quick'n'dirty document preview (pastable into the erl shell):
rr("include/exml.hrl").
El = #xmlel{name = <<"outer">>,
            attrs = [{<<"attr1">>, <<"val1">>},
                     {<<"attr2">>, <<"val-two">>}],
            children = [#xmlel{name = <<"inner-childless">>},
                        #xmlel{name = <<"inner-w-children">>,
                               children = [#xmlel{name = <<"a">>}]}]}.
io:format("~s", [exml:to_pretty_iolist(El)]).
which prints:
<outer attr2='val-two' attr1='val1'>
  <inner-childless/>
  <inner-w-children>
    <a/>
  </inner-w-children>
</outer>
For an example of using the streaming API, see test/exml_stream_tests.erl.
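As a rough illustration of the streaming workflow, the sketch below feeds a list of binary chunks to a single parser and collects whatever elements come out. It assumes exml_stream exports new_parser/0, parse/2 (returning {ok, NewParser, Elements} or {error, Reason}) and free_parser/1. Check these against the test file and your exml version. The module name stream_sketch and the function parse_chunks/1 are hypothetical.

```erlang
-module(stream_sketch).
-export([parse_chunks/1]).

%% Feeds binary chunks to one streaming parser and returns every
%% element emitted along the way, in order.
%% Assumed API (verify against your exml version):
%%   exml_stream:new_parser/0  -> {ok, Parser}
%%   exml_stream:parse/2       -> {ok, NewParser, Elements} | {error, Reason}
%%   exml_stream:free_parser/1 -> releases the parser
parse_chunks(Chunks) when is_list(Chunks) ->
    {ok, Parser} = exml_stream:new_parser(),
    feed(Parser, Chunks, []).

%% No chunks left: release the parser and flatten the collected batches.
feed(Parser, [], Acc) ->
    _ = exml_stream:free_parser(Parser),
    {ok, lists:append(lists:reverse(Acc))};
%% Parse one chunk and carry the updated parser state forward.
feed(Parser, [Chunk | Rest], Acc) ->
    case exml_stream:parse(Parser, Chunk) of
        {ok, NewParser, Elements} ->
            feed(NewParser, Rest, [Elements | Acc]);
        {error, Reason} ->
            _ = exml_stream:free_parser(Parser),
            {error, Reason}
    end.
```

A chunk boundary may fall in the middle of a tag, as in parse_chunks([<<"<stream><fo">>, <<"o/></stream>">>]). The parser state returned by each call is what carries the partial input over to the next one.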
The exml_query module exposes powerful helper functions for navigating the tree. Please refer to the module documentation for details.
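A few of these helpers can be sketched as follows, assuming exml_query exports attr/2, subelement/2 and path/2, and that path/2 accepts {element, Name} steps. The module name query_sketch is hypothetical, and the element is obtained by parsing so that the sketch does not depend on how attributes are stored internally.

```erlang
-module(query_sketch).
-export([demo/0]).

%% Parses a small document and looks things up in it.
%% Assumed API: exml_query:attr/2, subelement/2 and path/2.
demo() ->
    {ok, El} = exml:parse(<<"<outer attr1='val1'><inner><a/></inner></outer>">>),
    %% Attribute value of the root element.
    Attr = exml_query:attr(El, <<"attr1">>),
    %% First direct child named "inner".
    Inner = exml_query:subelement(El, <<"inner">>),
    %% Nested lookup: outer -> inner -> a.
    A = exml_query:path(El, [{element, <<"inner">>}, {element, <<"a">>}]),
    %% A missing attribute is assumed to yield the atom undefined.
    Missing = exml_query:attr(El, <<"no-such-attr">>),
    {Attr, Inner, A, Missing}.
```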
The implementation uses C++ thread-local memory pools of size 10MB by default, to improve cache locality and memory allocation patterns. To change the pool sizes, override RAPIDXML_STATIC_POOL_SIZE and/or RAPIDXML_DYNAMIC_POOL_SIZE at compilation time. To further improve performance, the NIF calls do not check input size, do not timeslice themselves, and do not run on dirty schedulers. This means that when called with inputs that are too large, the NIFs can starve the VM. It is up to the developer to throttle input sizes and fine-tune the memory pool sizes.
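One way to throttle input sizes is to check the binary's size before it ever reaches the NIF. The sketch below is an illustration only: the module parse_guard, the function parse_limited/2 and the error tuple are hypothetical names, and a suitable limit depends on your workload and pool sizes.

```erlang
-module(parse_guard).
-export([parse_limited/2]).

%% Rejects inputs larger than MaxBytes without calling the NIF;
%% smaller inputs are passed on to exml:parse/1 unchanged.
parse_limited(Bin, MaxBytes)
  when is_binary(Bin), is_integer(MaxBytes), MaxBytes > 0 ->
    case byte_size(Bin) of
        Size when Size > MaxBytes ->
            %% Hypothetical error shape, chosen for this sketch.
            {error, {input_too_large, Size, MaxBytes}};
        _ ->
            exml:parse(Bin)
    end.
```

The byte_size/1 check is constant-time, so the guard adds negligible cost to the parsing call.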