jgm Posted March 21, 2006 Posted March 21, 2006 Hi, I'm new to all of this - vb, .net, xml, etc. In fact the last programming I did was about 10 years ago. So please forgive any noobie-ness. I'm using vb in vs 2005, and I want to read in xml files, validate them against a schema, and insert the data into a sql server express database. The files are quite large - 20-50MB, and I'm told could get as large as 100MB+ Here's my questions: Should I load the xml into a xmlDataDocument / dataset? Will it work ok with such large files? If I do this, Am I correct to think I'll need to create a new dataset schema which will define the relationships between the tables? Or should I step through the xml node by node, and insert one row at a time into the database? Obviously I realise this is not very OO - there must be a better way...? The xml data has no key / id columns. The only unique identifier would be a combination of columns. If I take the second approach, I understand I could retrieve the last row ID from the database table and increment it for each new row I insert. If I use a dataset, how can I do this? Sorry if these are basic questions, like I said - I'm new to this. Any advice or links to examples will be very appreciated. Thanks!! JM Heres the existing tables & schema. (I cannot change the schema, although I think it could probably be improved) orders table OrderID - Key OrderField1 OrderField2 OrderField3 OrderDateTime OrderField4 OrderItems table ItemID - Key OrderID - FK ItemField1 ItemField2 ItemField3 ItemField4 <?xml version="1.0"?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"> <xs:element name="orders"> <xs:complexType> <xs:sequence> <xs:element name="order_details" maxOccurs="unbounded"> <xs:complexType> <xs:sequence> <xs:element name="orderfield1" type="xs:integer"/> <xs:element name="orderfield2" type="xs:string"/> <xs:element name="orderfield3" type="xs:string"/> <xs:element name="orderdate"> <xs:complexType> <xs:sequence> <xs:element name="year" type="xs:integer"/> <xs:element name="month" type="xs:integer"/> <xs:element name="day" type="xs:integer"/> <xs:element name="hour" type="xs:integer"/> <xs:element name="minute" type="xs:integer"/> <xs:element name="second" type="xs:integer"/> </xs:sequence> </xs:complexType> </xs:element> <xs:element name="orderfield4"> <xs:simpleType> <xs:restriction base="xs:string"> <xs:enumeration value="option1"/> <xs:enumeration value="option2"/> <xs:enumeration value="option3"/> </xs:restriction> </xs:simpleType> </xs:element> <xs:element name="order_items" maxOccurs="unbounded"> <xs:complexType> <xs:sequence> <xs:element name="itemfield1" type="xs:string"/> <xs:element name="itemfield2" type="xs:string"/> <xs:element name="itemfield3" type="xs:string"/> <xs:element name="itemfield4" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:sequence> </xs:complexType> </xs:element> </xs:schema> Quote
Administrators PlausiblyDamp Posted March 21, 2006 Administrators Posted March 21, 2006 If you can load the XML into a DataSet / DataTable then you could always take advantage of the System.Data.SqlClient.SqlBulkCopy class to make the actual inserting easier. If the documents get too large (really depends on available RAM) then an alternate solution may be required. Also if the existing XML already has a defined schema you can use that within VS - otherwise you will benefit from creating a schema / strongly typed DataSet within VS. Quote Posting Guidelines FAQ Post Formatting Intellectuals solve problems; geniuses prevent them. -- Albert Einstein
jgm Posted March 21, 2006 Author Posted March 21, 2006 (edited) Thanks for your comments. I'll check out the SqlBulkCopy class. I'd rather go the dataset way - it seems the correct approach and less work. I'm just not sure how to maintain the relationships between the tables. Should I derive a dataset from the tables or from the schema? Also if the existing XML already has a defined schema you can use that within VS - otherwise you will benefit from creating a schema / strongly typed DataSet within VS. Is this just a matter of taking the existing schema and adding codegen & msdata attributes to define the relationships etc? What about adding ID / key fields? All the examples I've seen have a unique ID element in each record in the xml already. Thanks again, JM Edited March 22, 2006 by jgm Quote
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.