require 'xml-mixup'
class Anything
  include XML::Mixup
end
something = Anything.new
# generate a structure
node = something.markup spec: [
  { '#pi' => 'xml-stylesheet', type: 'text/xsl', href: '/transform' },
  { '#dtd' => :html },
  { '#html' => [
    { '#head' => [
      { '#title' => 'look ma, title' },
      { '#elem' => :base, href: 'http://the.base/url' },
    ] },
    { '#body' => [
      { '#h1' => 'Illustrious Heading' },
      { '#p' => :lolwut },
    ] },
  ], xmlns: 'http://www.w3.org/1999/xhtml' }
]
# `node` will correspond to the last thing generated. In this
# case, it will be a text node containing 'lolwut'.
doc = node.document
puts doc.to_xml
# => <?xml version="1.0"?>
# => <?xml-stylesheet href="/transform" type="text/xsl"?>
# => <!DOCTYPE html>
# => <html xmlns="http://www.w3.org/1999/xhtml">
# =>   <head>
# =>     <title>look ma, title</title>
# =>     <base href="http://the.base/url"/>
# =>   </head>
# =>   <body>
# =>     <h1>Illustrious Heading</h1>
# =>     <p>lolwut</p>
# =>   </body>
# => </html>
Some time ago, I wrote a Perl module called Role::Markup::XML. I did this because I had a lot of XML to generate, and was dissatisfied with what was currently on offer. Now I have a lot of XML to generate using Ruby, and found a lot of the same things:
- Granted it's a lot nicer to do this sort of thing in Ruby, but at the end of the day, the thing generating the XML is a nested list of method calls, not a declarative data structure.
- It's not super-easy to generate a piece of the target document and then go back and generate some more (although Nokogiri::XML::Builder.with is a nice start). This plus the last point leads to all sorts of cockamamy constructs which are almost as life-sucking as writing raw DOM routines.
- This comes up a lot: you have an existing document and you want to add even just a single node to it, say, in between two nodes just for fun. Good luck with that.
- The input consists of ordinary Ruby data objects so you can build them up ahead of time, in bulk, transform them using familiar operations, etc.,
- Sprinkle pre-built XML subtrees anywhere into the spec so you can memoize repeating elements, or otherwise compile a document incrementally,
- Attach new generated content anywhere: underneath a parent node, or before, after, or instead of a node at the sibling level.
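To illustrate the first point, here is a minimal sketch of assembling a spec from ordinary Ruby data before anything touches XML. The `links` data and the `nav` class name are made up for the example; only the `'#name'` key convention and the attribute keys come from this README.

```ruby
# Plain data the markup will be built from (illustrative values).
links = { 'Home' => '/', 'About' => '/about' }

# Each entry becomes an <li> wrapping an <a>: '#name' keys name the
# element, ordinary keys such as href: become attributes.
items = links.map do |label, href|
  { '#li' => { '#a' => label, href: href } }
end

# The finished spec is just a Hash; it could later be handed to
# something.markup spec: spec
spec = { '#ul' => items, class: 'nav' }
```

Because the spec is only hashes and arrays, it can be filtered, merged, or cached with the usual Enumerable operations before it is rendered.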
At the heart of this module is a single method called markup, which, among other things, takes a :spec. The spec can be any composite of these objects, and will behave as described:
The principal construct in XML::Mixup is the Hash. You can generate pretty much any node with it:
{ '#tag' => 'foo' } # => <foo/>
# or, with the element name as a symbol
{ '#element' => :foo } # => <foo/>
# or
{ '#elem' => 'foo' } # => <foo/>
# or, with nil as a key
{ nil => :foo } # => <foo/>
# or, with attributes
{ nil => :foo, bar: :hi } # => <foo bar="hi"/>
# or, with namespaces
{ nil => :foo, xmlns: 'urn:x-bar' } # => <foo xmlns="urn:x-bar"/>
# or, with more namespaces
{ nil => :foo, xmlns: 'urn:x-bar', 'xmlns:hurr' => 'urn:x-durr' }
# => <foo xmlns="urn:x-bar" xmlns:hurr="urn:x-durr"/>
# or, with content
{ nil => [:foo, :hi] } # => <foo>hi</foo>
# or, shove your child nodes into an otherwise content-less key
{ [:hi] => :foo, bar: :hurr } # => <foo bar="hurr">hi</foo>
# or, if you have content and the element name is not a reserved word
{ '#html' => { '#head' => { '#title' => :hi } } }
# => <html><head><title>hi</title></head></html>
# also works with namespaces
{ '#atom:feed' => nil, 'xmlns:atom' => 'http://www.w3.org/2005/Atom' }
# => <atom:feed xmlns:atom="http://www.w3.org/2005/Atom"/>
Reserved hash keywords are: #comment, #cdata, #doctype, #dtd, #elem, #element, #pi, #processing-instruction, #tag. Note that the constructs { nil => :foo }, { nil => 'foo' }, and { '#foo' => nil }, plus [] anywhere you see nil, are all equivalent.
Attributes are sorted lexically. Composite attribute values get flattened like this:
{ nil => :foo, array: [:a, :b], hash: { e: :f, c: :d } }
# => <foo array="a b" hash="c: d e: f"/>
Note that an attribute value can also be a Proc, which is fed arbitrary arguments from the markup method. The Proc is expected to return something which can subsequently be flattened. If an attribute value is nil or ultimately resolves to nil, or is an empty Array or Hash, that attribute will be omitted. nil values in arrays or hashes will also be skipped, as will empty-string values in arrays. This is different behaviour from versions prior to 0.1.10, where nil (or, e.g., []) would produce an attribute containing the empty string.
This change was made to eliminate a lot of clunky logic in application code to determine whether or not to include a given attribute. If you need to render attributes explicitly with empty strings, then explicitly pass in the empty string.
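The flattening and omission rules above can be sketched in plain Ruby. The `flatten_attr` helper below is hypothetical and is not XML::Mixup's own code; it only mimics the behaviour described here, where a `nil` result stands for "omit the attribute". How the real method passes its arguments to a Proc is an assumption.

```ruby
# Hypothetical helper, NOT part of XML::Mixup: it imitates the
# attribute-flattening rules described in this README.
# Returns a String, or nil when the attribute should be omitted.
def flatten_attr(value, *args)
  # Procs (and other callables) are resolved first; passing the
  # arguments as a splat is an assumption.
  value = value.call(*args) if value.respond_to?(:call)

  case value
  when nil
    nil
  when Hash
    # Skip nil values, sort by key, render as "key: value" pairs.
    pairs = value.reject { |_, v| v.nil? }
                 .sort_by { |k, _| k.to_s }
                 .map { |k, v| "#{k}: #{v}" }
    pairs.empty? ? nil : pairs.join(' ')
  when Array
    # Skip nil and empty-string members, join the rest with spaces.
    items = value.compact.map(&:to_s).reject(&:empty?)
    items.empty? ? nil : items.join(' ')
  else
    value.to_s
  end
end

flatten_attr([:a, :b])                   # => "a b"
flatten_attr({ e: :f, c: :d })           # => "c: d e: f"
flatten_attr([])                         # => nil (attribute omitted)
flatten_attr('')                         # => "" (explicit empty string kept)
flatten_attr(->(n) { "item-#{n}" }, 3)   # => "item-3"
```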
{ '#pi' => 'xml-stylesheet', type: 'text/xsl', href: '/transform' }
# => <?xml-stylesheet type="text/xsl" href="/transform"?>
# or, if you like typing
{ '#processing-instruction' => :hurr } # => <?hurr?>
{ '#dtd' => :html } # => <!DOCTYPE html>
# or (note either :public or :system can be nil)
{ '#dtd' => [:html, :public, :system] }
# => <!DOCTYPE html PUBLIC "public" "system">
# or, same thing
{ '#doctype' => :html, public: :public, system: :system }
Comments and CDATA are flattened into string literals:
{ '#comment' => :whatever } # => <!-- whatever -->
{ '#cdata' => '<what-everrr>' } # => <![CDATA[<what-everrr>]]>
Pretty straightforward?
Parts of a spec that are arrays (or really anything that can be turned into one) are attached at the same level of the document in the sequence given, as you might expect.
Pre-built XML nodes placed in the spec are automatically cloned, but otherwise passed in as-is.
Procs are executed with any supplied :args, and then markup is run again over the result. (Take care not to supply a Proc that produces another Proc.)
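As a rough illustration of the Proc behaviour described above, the sketch below expands callables in a spec by hand. `expand_callables` is a hypothetical stand-in, not a method of XML::Mixup, and the way arguments are passed to the Proc is an assumption.

```ruby
# A lambda that yields a spec fragment when called.
byline = ->(author) { { '#p' => "By #{author}", class: 'byline' } }

# A spec mixing an ordinary Hash with the lambda.
spec = [{ '#h1' => 'Illustrious Heading' }, byline]

# Hypothetical stand-in for what happens to callables in a spec:
# call them with the supplied arguments and keep everything else.
def expand_callables(spec, *args)
  spec.map { |part| part.respond_to?(:call) ? part.call(*args) : part }
end

expanded = expand_callables(spec, 'Anything')
# expanded is now a plain list of Hash specs, ready for another pass.
```

The result of each call is itself a spec, which is why a Proc that returns another Proc is best avoided.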
Anything else is turned into a text node.
Documentation is generated and deposited in the usual place.
Come on, you know how to do this:
$ gem install xml-mixup
Or, download it off rubygems.org.
Bug reports and pull requests are welcome at the GitHub repository.
As mentioned, this is pretty much a straight-across port of Role::Markup::XML, where it makes sense in Perl to bolt a bunch of related pseudo-private _FOO-looking instance methods onto an object so you can use them to make more streamlined methods. This may or may not make the same kind of sense with Ruby.
In particular, these methods do not touch the calling object's state. In fact they should be completely stateless and side-effect free. Likewise, they are really meant to be private. As such, it may make sense to simply bundle them as class methods and use them as such. I don't know, I haven't decided yet.
This software is provided under the Apache License, 2.0.