Xtreme .Net Talk

I'm trying to pull some data from a Microsoft Word XML document, but the code I have generates sloppy output (see streamLog.txt). I am trying to write each row of tab-delimited data on its own line, but it never appends a carriage return.

 

My code consists of 2 methods: readDocument and getXmlTables. The readDocument routine works - it is just to show you how I open the file (if anyone needs it). The problem comes from getXmlTables not putting rows on separate lines:

 

private void readDocument()
{
  XmlDocument myXMLDocument = new XmlDocument();
  myXMLDocument.Load("myDoc.xml"); // XmlDocument has no constructor that takes a file name
  const string wordDocXml = "/word/document.xml";
  string pkgNs = myXMLDocument.DocumentElement.Attributes["xmlns:pkg"].Value;
  XPathNavigator nav = myXMLDocument.CreateNavigator();
  if ((nav.MoveToFollowing("part", pkgNs)) && (nav.MoveToAttribute("name", pkgNs)))
     while (nav.Value != wordDocXml)
        if (!(nav.MoveToParent()) || !(nav.MoveToFollowing("part", pkgNs)) || !(nav.MoveToAttribute("name", pkgNs)))
           throw new InvalidOperationException("Unexpected file format. Missing [pkg] namespace.");
  if (nav.Value == wordDocXml)
  {
     string path = @"C:\Temp";
     if (Directory.Exists(path))
     {
        FileInfo file = new FileInfo(Path.Combine(path, "streamLog.txt"));
        using (StreamWriter sw = new StreamWriter(file.FullName, false, Encoding.UTF8))
        {
           getXmlTables(nav, sw);
           sw.Close();
        }
     }
  }
}

void getXmlTables(XPathNavigator nav, StreamWriter stream)
{
  string wNs = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
  while (nav.MoveToFollowing("tbl", wNs))
  {
     while (nav.MoveToFollowing("tr", wNs))
     {
        while (nav.MoveToFollowing("t", wNs))
        {
           stream.Write(string.Format("{0}\t", nav.Value));
        }
        stream.WriteLine();
     }
     stream.WriteLine();
  }
  stream.WriteLine();
}

 

Forgive this question that seems simple, but I can't find an answer.

 

The attached zip file contains the XML document exported by Microsoft Word that I am trying to parse, along with the log that my code creates.

 

I've looked at several examples of how to use XPathNavigator, but I have not found an explanation detailed enough for me to grasp the concepts.

 

Is anyone here experienced with XPathNavigator? I'd really appreciate some advice.

 

Regards,

Joe

myDoc.zip
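
A likely cause of the single-line output: MoveToFollowing searches the rest of the document in document order, not just the subtree of the current node, so the innermost loop in getXmlTables would consume every w:t element before the row loop gets a chance to write a line break. Below is a minimal sketch of one alternative that scopes each search with SelectDescendants. It assumes the navigator is positioned on an element or the document root (not on the pkg:name attribute), that tables are not nested, and that the text of a cell is simply the Value of its w:tc element. The names TableDump and WriteTables are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.XPath;

static class TableDump
{
    const string WNs = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Writes every w:tr below 'nav' as one line, with cell texts separated by tabs,
    // and a blank line after each table.
    // Assumption: 'nav' is on an element or the root, and tables are not nested.
    public static void WriteTables(XPathNavigator nav, TextWriter output)
    {
        foreach (XPathNavigator tbl in nav.SelectDescendants("tbl", WNs, false))
        {
            // SelectDescendants only looks inside the current table.
            foreach (XPathNavigator tr in tbl.SelectDescendants("tr", WNs, false))
            {
                List<string> cells = new List<string>();
                foreach (XPathNavigator tc in tr.SelectDescendants("tc", WNs, false))
                {
                    // Value concatenates all text below the cell (all runs).
                    cells.Add(tc.Value);
                }
                output.WriteLine(string.Join("\t", cells.ToArray()));
            }
            output.WriteLine();
        }
    }
}
```

With the code from the post, this could be called as TableDump.WriteTables(nav, sw) after a nav.MoveToParent(), since the navigator is left on the pkg:name attribute and an attribute has no descendants. If nested tables matter, each search would need to be restricted further, for example with an XPath expression passed to Select.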
