[SoapRMI] Error reading big cdata chunks with XPP2
Timo Thomas
timothomas_at_web.de
Tue, 09 Sep 2003 13:12:55 +0200
Aleksander Slominski wrote:
> obtaining offset in file is one of features that i have coded in XPP2
> and also patched Xerces2 to provide this information but not many people
> found it useful and it was not applied in Xerces2 CVS (you can find it
> discussed in xerces-j-dev mailing list).
What a pity. I find it very useful to have fast access to
XML-fragments stored in files, in my case Log4J-logs. With the byte
position at hand, I can issue a skip() operation (which is O(1) on most
file systems) to the input stream before passing it to the XML parser
via setInput().
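
A minimal sketch of the skip-then-parse idea, assuming the byte offset of the fragment is already known. The class name OffsetStreams and the helpers openAt and skipFully are hypothetical names for illustration, not part of XPP2. skip() is called in a loop because InputStream.skip() may skip fewer bytes than requested. The returned stream would then be handed to the parser via setInput(); that call is left out here.

```java
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public final class OffsetStreams {
    private OffsetStreams() {}

    /** Opens the file and positions the stream at the given byte offset. */
    public static InputStream openAt(String path, long offset) throws IOException {
        InputStream in = new FileInputStream(path);
        try {
            skipFully(in, offset);
        } catch (IOException e) {
            in.close();
            throw e;
        }
        return in;
    }

    /**
     * Skips exactly n bytes. skip() may return less than requested, so it is
     * retried; when it makes no progress, a single read() is used to tell
     * "nothing skipped right now" apart from end of stream.
     * Note: some streams (FileInputStream among them) may skip past the end
     * of the data without reporting it, so EOF is not always detected here.
     */
    public static void skipFully(InputStream in, long n) throws IOException {
        if (n < 0) {
            throw new IllegalArgumentException("offset must not be negative");
        }
        long remaining = n;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() < 0) {
                throw new EOFException("stream ended before the requested offset");
            } else {
                remaining--; // the read() above consumed one byte
            }
        }
    }
}
```
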
>> Unfortunately, your parser seems to read bytes in advance to parsing
>> them (I've installed a counter in the InputStream to test this). Do you
>> know a way to circumvent this caching or any other way to get the
>> offsets?
>
>
> yes - use your custom InputStream that will allow reading only one byte
> at a time for all read() functions. using this technique parsing is a bit
> slower but you will have exact count of how many bytes were read.
>
> let me know if it worked.
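
The suggestion quoted above could look roughly like the following. This is a sketch only: the class name SingleByteCountingInputStream and the getCount() accessor are hypothetical, and whether the count matches the parser's position still depends on how the parser (or any Reader wrapped around the stream) buffers its input.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SingleByteCountingInputStream extends FilterInputStream {
    private long count; // bytes handed out (or skipped) so far

    public SingleByteCountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            count++;
        }
        return b;
    }

    /** Hands out at most one byte per call, whatever length was requested. */
    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int b = read();
        if (b < 0) {
            return -1;
        }
        buf[off] = (byte) b;
        return 1;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        if (skipped > 0) {
            count += skipped;
        }
        return skipped;
    }

    /** mark/reset would make the count ambiguous, so it is switched off. */
    @Override
    public boolean markSupported() {
        return false;
    }

    public long getCount() {
        return count;
    }
}
```
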
I didn't try but probably you're right. I set up a second input stream
instead that counts the newline characters, reading the stream in 64K
blocks. Hope this is as fast as (or faster than) your proposed solution,
but I don't have the time at the moment to test this.
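
For illustration, a rough sketch of such a newline-counting pass. It assumes the goal is to turn a line number into a byte offset, and that the file uses an ASCII-compatible encoding such as UTF-8, where byte 0x0A always means a line feed. The class name LineOffsetScanner and the method offsetOfLine are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;

public final class LineOffsetScanner {
    private static final int BLOCK_SIZE = 64 * 1024; // 64K blocks

    private LineOffsetScanner() {}

    /**
     * Returns the byte offset at which the given 1-based line starts,
     * or -1 if the stream contains fewer lines.
     */
    public static long offsetOfLine(InputStream in, int line) throws IOException {
        if (line < 1) {
            throw new IllegalArgumentException("line numbers start at 1");
        }
        if (line == 1) {
            return 0;
        }
        byte[] block = new byte[BLOCK_SIZE];
        long blockStart = 0;   // offset of the first byte of the current block
        int currentLine = 1;
        int n;
        while ((n = in.read(block)) != -1) {
            for (int i = 0; i < n; i++) {
                // each '\n' ends one line; the next byte starts the following line
                if (block[i] == '\n' && ++currentLine == line) {
                    return blockStart + i + 1;
                }
            }
            blockStart += n;
        }
        return -1;
    }
}
```
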
Thanks,
Timo