Jump to content
Xtreme .Net Talk

Recommended Posts

Posted

Hi,

 

I'm new to all of this - vb, .net, xml, etc. In fact the last programming I did was about 10 years ago. So please forgive any noobie-ness.

 

I'm using vb in vs 2005, and I want to read in xml files, validate them against a schema, and insert the data into a sql server express database.

 

The files are quite large - 20-50MB, and I'm told could get as large as 100MB+

 

Here's my questions:

 

Should I load the xml into a xmlDataDocument / dataset? Will it work ok with such large files? If I do this, Am I correct to think I'll need to create a new dataset schema which will define the relationships between the tables?

 

Or should I step through the xml node by node, and insert one row at a time into the database? Obviously I realise this is not very OO - there must be a better way...?

 

The xml data has no key / id columns. The only unique identifier would be a combination of columns. If I take the second approach, I understand I could retrieve the last row ID from the database table and increment it for each new row I insert. If I use a dataset, how can I do this?

 

 

Sorry if these are basic questions, like I said - I'm new to this. Any advice or links to examples will be very appreciated.

 

Thanks!!

 

JM

 

Heres the existing tables & schema. (I cannot change the schema, although I think it could probably be improved)

 

orders table

OrderID - Key

OrderField1

OrderField2

OrderField3

OrderDateTime

OrderField4

 

OrderItems table

ItemID - Key

OrderID - FK

ItemField1

ItemField2

ItemField3

ItemField4

 

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
<xs:element name="orders">
	<xs:complexType>
		<xs:sequence>
			<xs:element name="order_details" maxOccurs="unbounded">
				<xs:complexType>
					<xs:sequence>
						<xs:element name="orderfield1" type="xs:integer"/>
						<xs:element name="orderfield2" type="xs:string"/>
						<xs:element name="orderfield3" type="xs:string"/>
						<xs:element name="orderdate">
							<xs:complexType>
								<xs:sequence>
									<xs:element name="year" type="xs:integer"/>
									<xs:element name="month" type="xs:integer"/>
									<xs:element name="day" type="xs:integer"/>
									<xs:element name="hour" type="xs:integer"/>
									<xs:element name="minute" type="xs:integer"/>
									<xs:element name="second" type="xs:integer"/>
								</xs:sequence>
							</xs:complexType>
						</xs:element>
						<xs:element name="orderfield4">
							<xs:simpleType>
								<xs:restriction base="xs:string">
									<xs:enumeration value="option1"/>
									<xs:enumeration value="option2"/>
									<xs:enumeration value="option3"/>
								</xs:restriction>
							</xs:simpleType>
						</xs:element>
						<xs:element name="order_items" maxOccurs="unbounded">
							<xs:complexType>
								<xs:sequence>
									<xs:element name="itemfield1" type="xs:string"/>
									<xs:element name="itemfield2" type="xs:string"/>
									<xs:element name="itemfield3" type="xs:string"/>
									<xs:element name="itemfield4" type="xs:string"/>
								</xs:sequence>
							</xs:complexType>
						</xs:element>
					</xs:sequence>
				</xs:complexType>
			</xs:element>
		</xs:sequence>
	</xs:complexType>
</xs:element>
</xs:schema>

  • Administrators
Posted

If you can load the XML into a DataSet / DataTable then you could always take advantage of the System.Data.SqlClient.SqlBulkCopy class to make the actual inserting easier.

 

If the documents get too large (really depends on available RAM) then an alternate solution may be required.

 

Also if the existing XML already has a defined schema you can use that within VS - otherwise you will benefit from creating a schema / strongly typed DataSet within VS.

Posting Guidelines FAQ Post Formatting

 

Intellectuals solve problems; geniuses prevent them.

-- Albert Einstein

Posted (edited)

Thanks for your comments. I'll check out the SqlBulkCopy class.

 

I'd rather go the dataset way - it seems the correct approach and less work. I'm just not sure how to maintain the relationships between the tables. Should I derive a dataset from the tables or from the schema?

 

 

Also if the existing XML already has a defined schema you can use that within VS - otherwise you will benefit from creating a schema / strongly typed DataSet within VS.

 

Is this just a matter of taking the existing schema and adding codegen & msdata attributes to define the relationships etc?

 

What about adding ID / key fields? All the examples I've seen have a unique ID element in each record in the xml already.

 

Thanks again,

 

JM

Edited by jgm

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...