[SoapRMI] [ANN] MXP1 new fast and small parsing engine for XMLPULL
Aleksander Slominski
aslom_at_cs.indiana.edu
Tue, 16 Apr 2002 18:24:59 -0500
hi,
i have completely rewritten the XPP3 parsing engine for XMLPULL; it is
now called MXP1 and is available from:
http://www.extreme.indiana.edu/xgws/xsoap/xpp/mxp1/
the complete MXP1 parser (without the factory, which is optional as the parser can be used directly) is less than 20 KB.
to estimate how MXP1 performs i used the SAX benchmark
from http://piccolo.sourceforge.net/bench.html, which i
modified to run tests for XMLPULL, with the following changes:
* tests parse from memory rather than from a file, to eliminate IO interference
* each test actually visits every element and appends its content to a StringBuffer
(to measure the real time needed to visit every node!)
* added the ability to measure the overhead of creating parser instances instead of reusing them
you can get the modified tests from http://www.extreme.indiana.edu/~aslom/xpp_sax2bench/
the actual test results are at: http://www.extreme.indiana.edu/~aslom/xpp_sax2bench/results.html
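
a minimal sketch of the measurement setup described above (this is not the actual benchmark code). to stay runnable with a plain JDK it uses the built-in JAXP SAX parser instead of XMLPULL; the class name InMemoryBench, the sample document and the warm-up count of 100 are all assumptions made for illustration:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class InMemoryBench {
    /** Counts start tags and collects all text, so every node is really visited. */
    static class VisitingHandler extends DefaultHandler {
        int count;
        final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            count++;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Parses the in-memory document once and returns the number of elements seen. */
    static int parseOnce(SAXParser parser, byte[] doc) throws Exception {
        VisitingHandler handler = new VisitingHandler();
        parser.parse(new ByteArrayInputStream(doc), handler); // no file IO involved
        return handler.count;
    }

    /** Returns elapsed milliseconds; reuse=false also measures parser creation. */
    static long run(byte[] doc, int iterations, boolean reuse) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        SAXParser parser = factory.newSAXParser();
        for (int i = 0; i < 100; i++) {
            parseOnce(parser, doc); // warm up before timing (count is an assumption)
        }
        long start = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            if (!reuse) {
                parser = factory.newSAXParser(); // pay the creation cost every time
            }
            parseOnce(parser, doc);
        }
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document, stands in for the benchmark data files.
        byte[] doc = "<root><a>one</a><b>two</b></root>".getBytes(StandardCharsets.UTF_8);
        int iterations = 2000;
        long reused = run(doc, iterations, true);
        long fresh = run(doc, iterations, false);
        System.out.println("reuse parser instances: " + reused + "ms");
        System.out.println("new parser each time:   " + fresh + "ms");
        System.out.println("average (reused): " + ((double) reused / iterations) + "ms");
    }
}
```

the average is simply elapsed time divided by iterations, which is how the figures in the runs below relate (for example 7801ms / 2000 = 3.9005ms).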
in all but two tests MXP1 is the fastest parser, about 5-20% faster than the second fastest, Piccolo.
MXP1 is slower than Piccolo for 'Mostly text' and 'Random XML' because MXP1 always reports
text combined as one event. that means an application really has no need to use a
StringBuffer to collect element content. i have kept the string buffer in all tests for symmetry, but
removing it speeds up the test to about the same level as Piccolo,
for example when the USE_SB flag is false in XmlPullTest:
C:\Forge\homepage\xpp_sax2bench>java -cp classes;parsers\xpp3_mxp1_beta1.jar XmlPullTest
data\rand_100.xml 2000 ns_on
using factory class org.xmlpull.mxp1.MXParserFactory
namespaces: true
reuse parser instances: true
using parser class org.xmlpull.mxp1.MXParserCachingStrings
Warming up the parser....
count=1220
Parsing data\rand_100.xml 2000 times by XmlPullTest
Elapsed time: 7801ms
Average parse time: 3.9005ms
<benchmark elapsed="7801" iterations="2000"/>
and changing the flag to true slows the parser down by about 12% (8753ms vs. 7801ms):
C:\Forge\homepage\xpp_sax2bench>java -cp classes;parsers\xpp3_mxp1_beta1.jar XmlPullTest
data\rand_100.xml 2000 ns_on
using factory class org.xmlpull.mxp1.MXParserFactory
namespaces: true
reuse parser instances: true
using parser class org.xmlpull.mxp1.MXParserCachingStrings
Warming up the parser....
count=1093
Parsing data\rand_100.xml 2000 times by XmlPullTest
Elapsed time: 8753ms
Average parse time: 4.3765ms
<benchmark elapsed="8753" iterations="2000"/>
and here is what Piccolo 0.8 reports:
C:\Forge\homepage\xpp_sax2bench>java -cp classes;parsers/Piccolo-0.8.jar
-Dorg.xml.sax.driver=com.bluecast.xml.Piccolo SAX2Test data\rand_100.xml 2000 ns_on
using parser class com.bluecast.xml.Piccolo
namespaces: true
reuse parser instances: true
Warming up the parser....
count=1174
Parsing data\rand_100.xml 2000 times by SAX2Test
Elapsed time: 7511ms
Average parse time: 3.7555ms
<benchmark elapsed="7511" iterations="2000"/>
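
to illustrate why the SAX side of the comparison needs a buffer at all: a SAX parser is allowed to deliver one element's text in several characters() calls, so the handler has to join the pieces itself, while a pull parser that reports text as one combined event can hand it over directly. the sketch below shows only the SAX side, using the JDK's built-in parser; the class name SaxTextCollector and the sample input are assumptions for illustration:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxTextCollector extends DefaultHandler {
    private final StringBuilder current = new StringBuilder();
    final List<String> texts = new ArrayList<>();
    int chunks; // how many characters() callbacks were received

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        current.setLength(0); // start collecting text for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        chunks++;
        current.append(ch, start, length); // text may arrive in pieces
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (current.length() > 0) {
            texts.add(current.toString()); // complete text is known only here
        }
        current.setLength(0);
    }

    /** Parses the given XML string and returns the non-empty element texts. */
    static List<String> collect(String xml) throws Exception {
        SaxTextCollector handler = new SaxTextCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.texts;
    }

    public static void main(String[] args) throws Exception {
        SaxTextCollector handler = new SaxTextCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader("<doc><p>a &amp; b</p></doc>")), handler);
        System.out.println("texts=" + handler.texts + " chunks=" + handler.chunks);
    }
}
```

how many chunks a given parser produces is implementation dependent; entity references such as &amp; are a common place for a split.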
comments about MXP1 and the tests are welcome.
thanks,
alek