[SoapRMI] Error reading big cdata chunks with XPP2

Timo Thomas timothomas_at_web.de
Tue, 09 Sep 2003 13:12:55 +0200


Aleksander Slominski wrote:

> obtaining offset in file is one of the features that i have coded in XPP2 
> and also patched Xerces2 to provide this information but not many people 
> found it useful and it was not applied in Xerces2 CVS (you can find it 
> discussed in the xerces-j-dev mailing list).

What a pity. I find it very useful to have fast access to XML fragments 
stored in files, in my case Log4J logs. With the byte position at hand, 
I can issue a skip() operation (which is O(1) on most file systems) on 
the input stream before passing it to the XML parser via setInput().
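
A minimal sketch of the skip step, using only java.io. The names
StreamSkipper and skipFully are hypothetical, and the parser call itself
is left out because the exact setInput() signature depends on the parser
version. The loop is there because InputStream.skip() is allowed to skip
fewer bytes than requested.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of XPP2 or Log4J.
final class StreamSkipper {
    /** Skips exactly n bytes, or throws EOFException if the stream is shorter. */
    static void skipFully(InputStream in, long n) throws IOException {
        long remaining = n;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() >= 0) {
                // skip() made no progress; consume a single byte instead.
                remaining--;
            } else {
                throw new EOFException(
                        "stream ended " + remaining + " bytes before the offset");
            }
        }
    }
}
```

After skipFully(in, offset) returns, the stream is positioned at the
fragment and can be handed to the parser.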

>> Unfortunately, your parser seems to read bytes ahead of parsing
>> them (I've installed a counter in the InputStream to test this). Do you
>> know a way to circumvent this caching or any other way to get the 
>> offsets?
> 
> 
> yes - use your custom InputStream that allows reading only one byte 
> at a time for all read() functions. using this technique parsing is a bit 
> slower but you will have an exact count of how many bytes were read.
> 
> let me know if it worked.
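
A minimal sketch of such a stream, assuming a java.io.FilterInputStream
wrapper is acceptable. The class name OneByteCountingInputStream and the
getCount() accessor are hypothetical. Whether the count matches the
parser position exactly still depends on how the parser consumes the
stream, which this sketch does not verify.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper: hands out at most one byte per read call and counts them.
final class OneByteCountingInputStream extends FilterInputStream {
    private long count;

    OneByteCountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            count++;
        }
        return b;
    }

    // FilterInputStream.read(byte[]) delegates here, so it is covered too.
    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int b = read();
        if (b < 0) {
            return -1;
        }
        buf[off] = (byte) b;
        return 1;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        count += skipped;
        return skipped;
    }

    // Disable mark/reset so the counter cannot drift from the real position.
    @Override
    public boolean markSupported() {
        return false;
    }

    /** Number of bytes consumed from the underlying stream so far. */
    long getCount() {
        return count;
    }
}
```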

I didn't try it, but you're probably right. Instead, I set up a second 
input stream that counts the newline characters, reading the stream in 
64K blocks. I hope this is as fast as (or faster than) your proposed 
solution, but I don't have time to test it at the moment.
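
A minimal sketch of block-wise newline counting, not the code actually
used here. The names NewlineCounter and countNewlines are hypothetical,
and the sketch assumes an encoding such as ASCII, ISO-8859-1 or UTF-8 in
which the byte 0x0A only ever represents a line feed.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: counts '\n' bytes while reading in 64K blocks.
final class NewlineCounter {
    static long countNewlines(InputStream in) throws IOException {
        byte[] block = new byte[64 * 1024];
        long newlines = 0;
        int n;
        // read() may fill only part of the block, so scan just the first n bytes.
        while ((n = in.read(block)) != -1) {
            for (int i = 0; i < n; i++) {
                if (block[i] == '\n') {
                    newlines++;
                }
            }
        }
        return newlines;
    }
}
```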

Thanks,
Timo