Friday, October 03, 2008

Graph Algorithms in XSLT

Notwithstanding that XSLT can be considered a functional programming language, you probably wouldn't choose to implement algorithms in it. Nonetheless, I recently had to split a very large XML document into smaller pieces (using xsl:result-document), and I wanted to redundancies between the outputs. So what I essentially wanted to do was separate elements in the input into "strongly-connected components." XSLT 2.0 provides lots of operators on sequences and sets, and allows you to define property functions, but what it does not lend itself to is the creation of new data structures. To be more precise, the data structures you'd need for depth-first search or Tarjan's disjoint sets would require you to copy elements into new structures. Easy, if you're working in a language that has pointer or reference equality; not possible in XLST. However, if you assume that the XSLT processor will use hash tables to represent sets, and you put that together with <xsl:key>, you can try a different approach.

Here's the input I want to process:

<?xml version="1.0"?>
<data>
<documents>
<document id="a">
<related-entities>1 2 3</related-entities>
</document>
<document id="b">
<related-entities>2 3 4</related-entities>
</document>
<document id="c">
<related-entities>5 6 7</related-entities>
</document>
<document id="d">
<related-entities>7 8 9</related-entities>
</document>
<document id="e">
<related-entities>10</related-entities>
</document>
</documents>
<entities>
<entity id="1"/>
<entity id="2"/>
<entity id="3"/>
<entity id="4"/>
<entity id="5"/>
<entity id="6"/>
<entity id="7"/>
<entity id="8"/>
<entity id="9"/>
<entity id="10"/>
</entities>
</data>

I'm going to create a hashmap (actually, a hashmultimap) from entity IDs to documents, and vice versa. Note that the keys have to be of an atomic value type, and the values have to be elements. You can't map, say, integers to strings. Alright: for any document I can get the list of entity IDs, and use them as keys pointing back to the document element. So if any two documents point to the same entity, those two documents (and, obviously, the entity as well) must belong in the same strongly-connected component. For a sequence of entity IDs I can create a set of document elements; I then have to determine if this set intersects with the set I've already accumulated. If yes, I union both sets and recurse; otherwise I spit out the subgraph, pop the stack, and resume with the next, as-yet-unselected document.

<?xml version="1.0"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="subgraphs">
<xsl:key name="linked-documents" match="/data/documents/document" use="tokenize(related-entities, ' ')"/>
<xsl:key name="linked-entities" match="/data/entities/entity" use="key('linked-documents', @id)/@id"/>
<xsl:template match="document">
<document id="{@id}"/>
</xsl:template>
<xsl:template name="create-subgraph">
<xsl:param name="subgraph" as="element(document)*"/>
<xsl:param name="selection" as="element(document)*" />
<xsl:message select="string-join($subgraph/@id, ',')" />
<xsl:message select="string-join($selection/@id, ',')" />
<xsl:choose>
<!-- If there are no more elements to select from, the subgraph is complete (even if empty). -->
<xsl:when test="not($selection)">
<subgraph>
<xsl:apply-templates select="$subgraph"/>
</subgraph>
</xsl:when>
<xsl:otherwise>
<xsl:variable name="linked-entities" as="element(entity)*" select="for $d in $subgraph return key('linked-entities', $d/@id)"/>
<xsl:message select="string-join($linked-entities/@id, ',')" />
<xsl:variable name="subselection" select="$selection[key('linked-entities', @id) intersect $linked-entities]"/>
<xsl:message select="string-join($subselection/@id, ',')" />
<xsl:choose>
<xsl:when test="$subselection">
<xsl:call-template name="create-subgraph">
<xsl:with-param name="selection" select="$selection except $subselection"/>
<xsl:with-param name="subgraph" select="$subgraph|$subselection"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<subgraph>
<xsl:apply-templates select="$subgraph"/>
</subgraph>
<xsl:variable name="new-selection" select="$selection except $subselection"/>
<xsl:call-template name="create-subgraph">
<xsl:with-param name="selection" select="$new-selection[position() gt 1]"/>
<xsl:with-param name="subgraph" select="$new-selection[1]"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:otherwise>
</xsl:choose>
</xsl:template>

<xsl:template match="/">
<subgraphs>
<xsl:call-template name="create-subgraph">
<!-- Make the first document element an SCC. -->
<xsl:with-param name="subgraph" select="/data/documents/document[1]"/>
<xsl:with-param name="selection" select="/data/documents/document[position() gt 1]"/>
</xsl:call-template>
</subgraphs>
</xsl:template>
</xsl:stylesheet>


In the end, sure enough, I get:

<subgraphs xmlns:fn="subgraphs" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<subgraph>
<document id="a"/>
<document id="b"/>
</subgraph>
<subgraph>
<document id="c"/>
<document id="d"/>
</subgraph>
<subgraph>
<document id="e"/>
</subgraph>
</subgraphs>

Yes, defining one mapping in terms of another is allowed in XSLT 2.0. The problem I had at work was actually more complicated: the entities were in entirely separate XML documents. The trick there, which cost me some sweat, was to define one mapping, which pointed from a document immediately to other documents. However, I'll put off that explanation until someone asks for it.

21 comments:

Anonymous said...

I got this site from my pal who informed me regarding this site and now this time I am browsing this web page and reading very informative posts here.
Here is my web page ... Student strips down and dance

Anonymous said...

I don't even know how I ended up here, but I thought this post was great. I do not know who you are but certainly you are going to a famous blogger if you are not already ;) Cheers!
Here is my blog post : free porn

Anonymous said...

Your way of telling everything in this post is genuinely good, all can
easily understand it, Thanks a lot.
Also visit my page : click the following webpage

Anonymous said...

I enjoy what you guys are up too. This kind of clever work and reporting!
Keep up the wonderful works guys I've added you guys to blogroll.
Visit my blog : http://www.teenpornpost.com/wild-overnight-rest

Anonymous said...

If some one needs expert view regarding blogging and
site-building then i advise him/her to pay a quick visit this blog,
Keep up the fastidious work.
Feel free to visit my weblog :: www.xxxvideofix.com

Anonymous said...

Hey there! I know this is somewhat off-topic but I needed to
ask. Does managing a well-established blog like yours take a large amount of work?
I'm brand new to running a blog however I do write in my journal on a daily basis. I'd like to start a blog so I can
easily share my experience and thoughts
online. Please let me know if you have any suggestions or
tips for new aspiring bloggers. Appreciate it!
Also see my webpage :: Bubbly Blonde On The Casting Couch

Anonymous said...

Saved as a favorite, I really like your blog!
Feel free to surf my web blog ... Amateur Free Porn

Anonymous said...

It's hard to come by educated people in this particular topic, however, you seem like you know what you're talking about!

Thanks
Feel free to visit my site - free teen porn

Anonymous said...

Malaysia & Singapore & brunei greatest online blogshop for wholesale & quantity korean add-ons, accessories,
earstuds, choker, rings, bracelet, bracelet & hair add-ons.
Deal 35 % wholesale discount. Ship Worldwide
My webpage : purchase new

Anonymous said...

top [url=http://www.001casino.com/]casino bonus[/url] brake the latest [url=http://www.casinolasvegass.com/]casino games[/url] unshackled no set aside perk at the leading [url=http://www.baywatchcasino.com/]www.baywatchcasino.com
[/url].

Anonymous said...

What's Going down i'm new to this, I stumbled upon this I've discovered It positively helpful and it has aided me out loads. I hope to contribute & aid other customers like its helped me. Good job.

My web site ... diet that works

Anonymous said...

Hurrah, that's what I was exploring for, what a information! existing here at this website, thanks admin of this site.

my web-site; music

Anonymous said...

Hi to all, how is the whole thing, I think every one is getting more from this web page, and your
views are pleasant in support of new people.

Feel free to surf to my web page :: torrents download speed

Anonymous said...

I am regular reаder, hoω are уou everуbody?
This pаragrаph posted at this websitе is tгuly pleasаnt.


My ρage ... Life Insurance quotes

Anonymous said...

I was extrеmelу pleаsed tο uncover this website.
I want to to thаnk you foг your timе due to this wonderful read!
! ӏ definitely really likеԁ every
little bit of it and I have you booκ-marked to
see new informatіon on your web site.

Ѕtοp bу my homepage - raspberry ketones

Anonymous said...

Every weekend i used to pay a visit this web site, because
i want enjoyment, as this this site conations genuinely nice funny information too.



Stop by my blog; www.sex4it.com

Anonymous said...

Hello there! This іs kіnԁ
of off topic but I neeԁ some help from an establiѕhed blog.
Iѕ it ѵегy hаrԁ to ѕet up
your οwn blog? I'm not very techincal but I can figure things out pretty quick. I'm
thinking about setting uρ mу own but I'm not sure where to start. Do you have any points or suggestions? Many thanks

My blog post - Pittsburgh florist

Anonymous said...

Мagnificent beat ! I would like to apprentіce even
as yоu аmend your site, hοw could і ѕubѕcгіbe for а wеblog sitе?

The асcount aided me a applicable deаl.
Ӏ hаve been a lіttle bit acquaintеd of this yοur bгοadcаst offerеd bright clear idea

my ѕite hgh supplements

Anonymous said...

I'm amazed, I must say. Seldom do I encounter a blog that's both еquallу educаtive
and amuѕіng, аnd lеt me tell you, уou
haνe hit the nail оn the head. The pгoblеm
іѕ somеthing which not еnοugh
men anԁ ωomen are ѕpeаκіng іntеlligеntlу about.
I am verу hаppу І cаme
асross thіѕ in my hunt for somеthing conсerning this.



Feel free to ѕurf to my webpage ... green coffee bean extract

Anonymous said...

With havin ѕo much written content do you еνer гun
into any problemѕ of ρlagorism оr copyright
violation? My blog haѕ a lοt оf completely unique content
I've either created myself or outsourced but it seems a lot of it is popping it up all over the internet without my authorization. Do you know any solutions to help reduce content from being ripped off? I'd ԁefinitеly appгeciate
it.

Here is my web blog: dr oz green coffee bean extract dosage

Anonymous said...

I love your blog.. veгy nice cοloгs & theme.
Did you dеsign this website yourself or did you hіre someοnе to do it fоr you?
Plz respond aѕ I'm looking to construct my own blog and would like to find out where u got this from. thank you

my homepage :: car transport