feurich Posted October 31, 2005
Hi there, can anyone help me out with this one? I need to read an XML file and convert it into another XML file with fewer tags. The original XML file contains information on single-page TIFF files, and I need to convert them into multipage TIFF files. That part is not the problem. The problem is reading the original XML file, which has any number of IMAGEFILE tags, and converting it into an XML file with only one IMAGEFILE tag. The original XML file also contains meta-information about the documents and relational information between the single-page TIFF files; that information needs to be preserved. My thought was to read the XML file into a DataSet, combine the single-page TIFFs into a multipage TIFF, and then export the data in the DataSet to a new XML file. But how do I remove the DataSet entries for the single-page TIFFs and replace them with a single DataSet entry for the multipage TIFF? Any help is useful. Cire
Trust the Universe
Administrators PlausiblyDamp Posted October 31, 2005
Could you post an example of both formats? If they are not too different, you may be able to get away with a bit of XSLT.
Intellectuals solve problems; geniuses prevent them. -- Albert Einstein
feurich Posted October 31, 2005
Original XML file:

<DOCUMENTS>
  <VERSION>2.1</VERSION>
  <LICENTIEHOUDER>Hummingbird</LICENTIEHOUDER>
  <XTN>Bulkimport</XTN>
  <ARCHIEFNAAM>Hummingbird archief</ARCHIEFNAAM>
  <DOCINDEXNAAM1>Organisatie</DOCINDEXNAAM1>
  <DOCINDEXNAAM2>Persoon extern</DOCINDEXNAAM2>
  <DOCINDEXNAAM3>Intern</DOCINDEXNAAM3>
  <DOCINDEXNAAM4>Project</DOCINDEXNAAM4>
  <DOCINDEXNAAM5>Soort</DOCINDEXNAAM5>
  <DOCINDEXNAAM6>Onderwerp</DOCINDEXNAAM6>
  <DOCINDEXNAAM7>Trefwoorden</DOCINDEXNAAM7>
  <DOCINDEXNAAM8>Documentdatum</DOCINDEXNAAM8>
  <DOCUMENT>
    <BRON>2.xx conversie Original DocId: 7875</BRON>
    <INDEXEERDATUM>2001-03-07</INDEXEERDATUM>
    <DOCINDEXWAARDE1>Index01</DOCINDEXWAARDE1>
    <DOCINDEXWAARDE2>Index02</DOCINDEXWAARDE2>
    <DOCINDEXWAARDE3>Index03</DOCINDEXWAARDE3>
    <DOCINDEXWAARDE4>Index04</DOCINDEXWAARDE4>
    <DOCINDEXWAARDE5>Index05</DOCINDEXWAARDE5>
    <DOCINDEXWAARDE6>Index06</DOCINDEXWAARDE6>
    <DOCINDEXWAARDE7>2001-03-07</DOCINDEXWAARDE7>
    <FILE>SinglePage1.TIF</FILE>
    <TYPE>TIF</TYPE>
    <FILE>SingelPage2.TIF</FILE>
    <TYPE>TIF</TYPE>
  </DOCUMENT>
</DOCUMENTS>

Converted XML file:

<DOCUMENTS>
  <VERSION>2.1</VERSION>
  <LICENTIEHOUDER>Hummingbird</LICENTIEHOUDER>
  <XTN>Bulkimport</XTN>
  <ARCHIEFNAAM>SinglePage Tiff archief</ARCHIEFNAAM>
  <DOCINDEXNAAM1>Organisatie</DOCINDEXNAAM1>
  <DOCINDEXNAAM2>Persoon extern</DOCINDEXNAAM2>
  <DOCINDEXNAAM3>Intern</DOCINDEXNAAM3>
  <DOCINDEXNAAM4>Project</DOCINDEXNAAM4>
  <DOCINDEXNAAM5>Soort</DOCINDEXNAAM5>
  <DOCINDEXNAAM6>Onderwerp</DOCINDEXNAAM6>
  <DOCINDEXNAAM7>Trefwoorden</DOCINDEXNAAM7>
  <DOCINDEXNAAM8>Documentdatum</DOCINDEXNAAM8>
  <DOCUMENT>
    <BRON>conversie Original DocId: 7875</BRON>
    <INDEXEERDATUM>2001-03-07</INDEXEERDATUM>
    <DOCINDEXWAARDE1>Index01</DOCINDEXWAARDE1>
    <DOCINDEXWAARDE2>Index02</DOCINDEXWAARDE2>
    <DOCINDEXWAARDE3>Index03</DOCINDEXWAARDE3>
    <DOCINDEXWAARDE4>Index04</DOCINDEXWAARDE4>
    <DOCINDEXWAARDE5>Index05</DOCINDEXWAARDE5>
    <DOCINDEXWAARDE6>Index06</DOCINDEXWAARDE6>
    <DOCINDEXWAARDE7>2001-03-07</DOCINDEXWAARDE7>
    <FILE>MultiPage.TIF</FILE>
    <TYPE>TIF</TYPE>
  </DOCUMENT>
</DOCUMENTS>

Note the <FILE> tags. Sorry for the bad markup; I couldn't find the XML markup tag :-)
Administrators PlausiblyDamp Posted October 31, 2005
If more than one <FILE> or <TYPE> tag exists, you simply want to keep the first and ignore the rest. Is that correct?
feurich Posted November 1, 2005
If by ignore you mean remove, then yes. If by ignore you mean leave them in the XML and not use them, then no. The converted XML file must contain only one <FILE> tag and one <TYPE> tag.
Administrators PlausiblyDamp Posted November 1, 2005 (edited)
A stylesheet similar to [...] should do the trick (I have not really tested it). If you need to do this via code, have a look at the System.Xml.Xsl.XslTransform class, as it allows you to load an XML document and a stylesheet and save the resulting document somewhere.
Edited November 1, 2005 by PlausiblyDamp
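The stylesheet attached to this post is not preserved here. As a rough illustration only (not PlausiblyDamp's original), a stylesheet for this kind of transformation might look like the sketch below. It assumes XSLT 1.0 and the element names from the sample files earlier in the thread; the parameter name `multiPageName` and its default value are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>
  <!-- Drop whitespace-only text nodes so removed elements leave no blank lines. -->
  <xsl:strip-space elements="*"/>

  <!-- Hypothetical parameter: file name of the merged multipage TIFF. -->
  <xsl:param name="multiPageName" select="'MultiPage.TIF'"/>

  <!-- Identity template: copy every node and attribute unchanged by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- The first FILE of each DOCUMENT is replaced by the multipage entry. -->
  <xsl:template match="DOCUMENT/FILE[1]">
    <FILE><xsl:value-of select="$multiPageName"/></FILE>
  </xsl:template>

  <!-- Every further FILE and TYPE is dropped; the first TYPE is kept
       by the identity template above. -->
  <xsl:template match="DOCUMENT/FILE[position() &gt; 1] | DOCUMENT/TYPE[position() &gt; 1]"/>

</xsl:stylesheet>
```

Each `xsl:template` declares a match pattern. The processor walks the input tree and applies the most specific matching template to every node, so the more specific `FILE` and `TYPE` templates override the identity template and everything else (the meta-information and index values) is copied through untouched. When running this through System.Xml.Xsl.XslTransform, a value for a parameter such as `multiPageName` can be supplied via an XsltArgumentList.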
feurich Posted November 1, 2005
OK, if I understand this right:
1) Create an XslTransform object
2) Load the stylesheet
3) Create an XPathDocument
4) Load the XML data into the XPathDocument
5) Transform the data
The thing I don't understand is this: I can load the original XML data into an XPathDocument, and through the transform I can 'export' it to another XML file with the same structure as the stylesheet. But how does the stylesheet 'know' which tags need to be merged? Or am I getting it completely wrong? :confused:
Administrators PlausiblyDamp Posted November 1, 2005
Just updated the code in the original post; the previous version didn't generate the multipage TIF entry correctly. You might want to try the new version.