Split a document using MarkLogic mlcp
I need to split this document:
<?xml version="1.0"?>
<!DOCTYPE docs SYSTEM "../rom11.dtd">
<docs>
  <stwtext id="rd-10-00258" update="03.2011" seq="rq-10-00001">
    <head>
      <ti><i>j</i></ti>
      <ff-list>
        <ff id="0103" />
      </ff-list>
    </head>
    <p>symbol für die <vw idref="rd-19-04447">stromdichte</vw>.</p>
  </stwtext>
  <stwtext id="rd-10-00209" update="12.2007" seq="rq-10-00223">
    <head>
      <ti>jz</ti>
      <ff-list>
        <ff id="0932" />
      </ff-list>
    </head>
    <p>abkürzung für jod-zahl, siehe <vw idref="rd-06-00645">fettkennzahlen</vw>.</p>
  </stwtext>
</docs>
I run this command:
~> bin/mlcp.sh import -mode local -host localhost -port 15000 \
     -username admin -password admin \
     -input_file_path /media/sf_vm.shared/theme/rom-training/v10.new-ml.xml \
     -output_uri_replace "/media/sf_vm.shared/theme/rom-training/keywords,'rom-data'" \
     -output_collections rom-data \
     -input_file_type aggregates -aggregate_record_element stwtext \
     -aggregate_uri_id @id
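The command runs fine, but the document IDs I see in MarkLogic do not come from the declared id attribute of stwtext. Instead, each document gets the id of the last element inside the record. For example, I am expecting to see these documents: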
rd-10-00258 rd-10-00260
But it looks like this:
0103 0932
Is this a bug, or did I perhaps do something wrong? Thanks.
It's a bug. If you'd like to, you can download the mlcp source code and change it. Take a look at processStartElement() in AggregateXMLReader.java.
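As a possible workaround outside mlcp, the aggregate file could be split before loading. The following is only a minimal sketch in Python, not mlcp's own logic: the function name split_records and the trimmed inline sample are hypothetical, and it assumes the file parses without the external DTD. It shows the intended behaviour, where the id is read from the stwtext record element itself and nested ids such as the one on ff are ignored.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the sample from the question (illustrative only).
SAMPLE = """<docs>
  <stwtext id="rd-10-00258">
    <head><ti><i>j</i></ti><ff-list><ff id="0103"/></ff-list></head>
  </stwtext>
  <stwtext id="rd-10-00209">
    <head><ti>jz</ti><ff-list><ff id="0932"/></ff-list></head>
  </stwtext>
</docs>"""


def split_records(xml_text, record_element="stwtext", id_attribute="id"):
    """Return {record id: serialized record} for each record element."""
    root = ET.fromstring(xml_text)
    records = {}
    for record in root.iter(record_element):
        # Read the attribute from the record element itself, so nested
        # elements that also carry an id (such as <ff id="0103"/>) are ignored.
        record_id = record.get(id_attribute)
        if record_id is None:
            continue  # skip records without an id rather than guess one
        records[record_id] = ET.tostring(record, encoding="unicode")
    return records


records = split_records(SAMPLE)
for record_id in records:
    print(record_id)
```

Each serialized record could then be written to its own file, named after its id, and imported as an ordinary document instead of as an aggregate.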