[SoapRMI] Setting parsing position in PullParser
John Morrow
john_at_jeecom.com
Mon, 4 Feb 2002 12:19:08 -0000
Hi,
I'm using Pull Parser v2 to process some XML containing OLAP database data
and produce an HTML table from the results. I have the data in a string, and
the data contains a number of <Tuple> tags, which contain other nested tags,
e.g.
...
<Tuples>
<Tuple>
<Member Hierarchy="Education Level">
<UName>[Education Level].[All Education Level].[Bachelors
Degree]</UName>
<Caption>Bachelors Degree</Caption>
<LName>[Education Level].[Education Level]</LName>
<LNum>1</LNum>
<DisplayInfo>131072</DisplayInfo>
</Member>
<Member Hierarchy="Gender">
<UName>[Gender].[All Gender].[Female]</UName>
<Caption>Female</Caption>
<LName>[Gender].[Gender]</LName>
<LNum>1</LNum>
<DisplayInfo>131072</DisplayInfo>
</Member>
<Member Hierarchy="Marital Status">
<UName>[Marital Status].[All Marital
Status].[Married]</UName>
<Caption>Married</Caption>
<LName>[Marital Status].[Marital Status]</LName>
<LNum>1</LNum>
<DisplayInfo>131072</DisplayInfo>
</Member>
</Tuple>
<Tuple>
<Member Hierarchy="Education Level">
<UName>[Education Level].[All Education Level].[Bachelors
Degree]</UName>
<Caption>Bachelors Degree</Caption>
<LName>[Education Level].[Education Level]</LName>
<LNum>1</LNum>
<DisplayInfo>131072</DisplayInfo>
</Member>
<Member Hierarchy="Gender">
<Caption>Female</Caption>
<LName>[Gender].[Gender]</LName>
<LNum>1</LNum>
<DisplayInfo>131072</DisplayInfo>
</Member>
<Member Hierarchy="Marital Status">
<UName>[Marital Status].[All Marital
Status].[Single]</UName>
<Caption>Single</Caption>
<LName>[Marital Status].[Marital Status]</LName>
<LNum>1</LNum>
<DisplayInfo>131072</DisplayInfo>
</Member>
</Tuple>
<Tuple>
...
Each tuple contains the headings for a single row of data. In this case each
row has 3 levels of headings (one per <Member>), and this data would be used
to generate a table something like this for the first 5 tuple tags:
----------------------------------------------------
| Bachelors    | Female   | Married   (row no1)
| Degree       |          |-------------------------
|              |          | Single    (row no2)
|              |----------|-------------------------
|              | Male     | Married   (row no3)
|              |          |-------------------------
|              |          | Single    (row no4)
|--------------|----------|-------------------------
| Graduate     | Female   | Married   (row no5)
| Degree       |          |-------------------------
...
Instead of printing each <Member>'s caption on every row, the
common ones are printed in a single cell spanning the appropriate number of
rows. To produce this kind of table in HTML, when I'm outputting the
headings for row no1 I need to know that the "Bachelors Degree" cell's
rowspan is 4 and that the "Female" cell has rowspan 2 and "Married" has
rowspan 1. However, it's not until I get to the 5th tuple and see that the
Education Level member has changed from "Bachelors Degree" to "Graduate
Degree" that I can determine that the rowspan for "Bachelors Degree" is 4,
so I need to read ahead a few tuples. Then when I want to output the second
row which contains only 1 new cell ("Single"), I need to access this data
from tuple no 2 but I've already read past it.
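
To make the rowspan rule concrete, here is a minimal sketch of the
calculation, assuming the captions for each row have already been pulled out
of the <Tuple> elements into string arrays. The class name RowSpans and its
methods are made up for illustration and are not part of Pull Parser; the
"Male" rows come from the table above, not from the XML excerpt.

```java
import java.util.Arrays;

public class RowSpans {
    // True when rows a and b share the same captions at levels 0..level.
    static boolean samePrefix(String[] a, String[] b, int level) {
        for (int i = 0; i <= level; i++) {
            if (!a[i].equals(b[i])) {
                return false;
            }
        }
        return true;
    }

    // spans[r][l] is the rowspan of the cell that starts at row r, level l,
    // or 0 when that position is covered by a cell from an earlier row.
    // Assumes every row has the same number of levels.
    static int[][] compute(String[][] rows) {
        int[][] spans = new int[rows.length][];
        for (int r = 0; r < rows.length; r++) {
            spans[r] = new int[rows[r].length];
            for (int level = 0; level < rows[r].length; level++) {
                if (r > 0 && samePrefix(rows[r - 1], rows[r], level)) {
                    continue; // same heading as the row above: no new cell
                }
                int span = 1;
                while (r + span < rows.length
                        && samePrefix(rows[r], rows[r + span], level)) {
                    span++;
                }
                spans[r][level] = span;
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        // Only the first five rows; spans for the last row would grow
        // if more "Graduate Degree" rows followed.
        String[][] rows = {
            {"Bachelors Degree", "Female", "Married"},
            {"Bachelors Degree", "Female", "Single"},
            {"Bachelors Degree", "Male", "Married"},
            {"Bachelors Degree", "Male", "Single"},
            {"Graduate Degree", "Female", "Married"},
        };
        for (int[] row : compute(rows)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

For row no1 this gives 4, 2 and 1, and for row no2 it gives 0, 0 and 1, i.e.
only the "Single" cell is written. The catch is that it needs all the rows
of a span in memory at once, which is exactly my problem.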
One option I have is to save the information about tuples 2, 3 and 4 when I'm
reading ahead to tuple no 5 and then for row 2 just use my saved data. In
the above case that's not a big overhead. However, some OLAP tables can be
extremely large and have cells with very big spans and many levels, so this
could take up a lot of memory.
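
A rough sketch of what this first option might look like, assuming I keep
only the caption strings per tuple (not the whole <Member> contents) and
flush whenever the outermost caption changes. TupleGroupBuffer and its
methods are hypothetical names; the "completed" list just stands in for
writing the HTML rows and discarding them.

```java
import java.util.ArrayList;
import java.util.List;

public class TupleGroupBuffer {
    // Rows read so far that still share the same outermost caption.
    private final List<String[]> pending = new ArrayList<>();
    // Stand-in for output: real code would emit HTML here and keep nothing.
    private final List<List<String[]>> completed = new ArrayList<>();

    // Call once per <Tuple>, with one caption per <Member>.
    public void add(String[] captions) {
        boolean outerChanged = !pending.isEmpty()
                && !pending.get(0)[0].equals(captions[0]);
        if (outerChanged) {
            flush(); // every span in the pending group is now known
        }
        pending.add(captions);
    }

    // Call after the last tuple so the final group is not lost.
    public void finish() {
        if (!pending.isEmpty()) {
            flush();
        }
    }

    private void flush() {
        completed.add(new ArrayList<>(pending));
        pending.clear();
    }

    public int pendingRows() {
        return pending.size();
    }

    public List<List<String[]>> completedGroups() {
        return completed;
    }
}
```

Memory use is then bounded by the largest group under a single outermost
heading, not by the whole result, but that group can still be very big.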
A second option I was trying to get working was to rewind back in the String
after I've done my read ahead. When I create my PullParser, I call
setInput( reader ), passing in a StringReader object. What I've tried to do
(and this is what my question is about!) is to call mark() on the
StringReader at the end of reading the tuple for the current row, then
read ahead however many tuples are necessary to figure out the rowspan, and
then call reset() on the StringReader so that it's ready for the following
row. This probably isn't the most stable thing to do, as I get errors later
on saying </Tuple> tags were found where </Member> tags were expected etc. I
then tried also calling reset() on the pull parser but I then get errors:
org.gjt.xpp.XmlPullParserException: only whitespace content allowed
outside root element at line 2 and column 17 seen ">\n "...
(parser state CONTENT)
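
My guess at what is going wrong, and it is only a guess since I haven't
checked the parser source: if the parser pulls characters from the Reader in
blocks into its own buffer, then the Reader's position is ahead of the
element the parser has just reported, so mark() lands in the wrong place.
The sketch below shows that effect using only java.io; the 8-char buffer
and the class name MarkResetDemo are made up to stand in for the parser.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class MarkResetDemo {
    // Drains the reader and returns everything that was left in it.
    static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = r.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    static String remainderAfterReset() {
        try {
            Reader reader = new StringReader("<a><b/></a>");

            // Stand-in for a parser that has only reported <a> so far,
            // but has already pulled 8 chars into its own buffer.
            char[] parserBuffer = new char[8];
            reader.read(parserBuffer);

            // The caller believes this marks the spot just after <a>.
            reader.mark(1024);
            readAll(reader);   // the "read ahead"
            reader.reset();    // rewinds to the mark, i.e. to char 8

            return readAll(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(remainderAfterReset());
    }
}
```

After reset() the Reader hands back only "/a>", not "<b/></a>", and on top
of that a real parser would still be holding the old contents of its buffer
and its own element stack.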
Does anyone have any experience of dealing with a similar parsing
situation, or know if reset()ting / rewinding can be done in this way? Any
ideas on a better way of solving this would be greatly appreciated.
Cheers,
John.