How to Parse an XML Document and Retrieve a Rowset: SQL Server Syntax

SQL Server provides the sp_xml_preparedocument stored procedure to parse an XML document. This stored procedure reads the XML document and parses it with the MSXML parser. Parsing an XML document involves validating the XML data against the structure defined in the DTD or schema.

The parsed document is an internal tree representation of the various nodes in the XML document, such as elements, attributes, text, and comments. sp_xml_preparedocument returns a handle that can be used to access the newly created internal representation of the XML document. This handle is valid for the duration of the session or until it is invalidated by executing sp_xml_removedocument.
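The prepare-and-remove life cycle described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the variable names @hDoc and @xml and the sample Orders document are assumptions made for this example.

-- Minimal sketch: parse an XML string, then release the handle.
-- @hDoc, @xml, and the sample data are hypothetical names and values.
DECLARE @hDoc int;             -- receives the document handle
DECLARE @xml nvarchar(max);    -- holds the XML text to parse

SET @xml = N'<Orders>
  <Order OrderID="1" CustomerID="ALFKI"/>
  <Order OrderID="2" CustomerID="BONAP"/>
</Orders>';

-- Parse the document and obtain a handle to the internal tree.
EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml;

-- ... use @hDoc with OPENXML here ...

-- Invalidate the handle and free the memory held by the parsed document.
EXEC sp_xml_removedocument @hDoc;

Calling sp_xml_removedocument as soon as the handle is no longer needed is a sensible habit, because the parsed tree otherwise stays in memory until the session ends.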

Retrieving a Rowset from the Tree

After verifying the structure and completeness of the data, you need to extract it from the parsed XML document. To do this, use the OPENXML function, which generates an in-memory rowset from the parsed data. The syntax of the OPENXML function is:

OPENXML ( idoc int [ in ] , rowpattern nvarchar [ in ] , [ flags byte [ in ] ] )
[ WITH ( SchemaDeclaration | TableName ) ]
Where,

idoc specifies the document handle of the internal tree representation of an XML document, as returned by sp_xml_preparedocument.
rowpattern specifies the XPath pattern used to identify the nodes (in the XML document whose handle is passed in the idoc parameter) to be processed as rows.
flags indicates the mapping that should be used between the XML data and the relational rowset, and how the spill-over column should be filled. flags is an optional parameter and can have the following values:
0 – to use the default mapping (attribute-centric)
1 – to retrieve attribute values (attribute-centric mapping)
2 – to retrieve element values (element-centric mapping)
3 – to retrieve both attribute and element values (attributes are mapped first, then elements for any columns not yet matched)
8 – can be combined with 1 or 2 so that consumed data is not copied to the overflow property @mp:xmltext
SchemaDeclaration specifies the rowset schema declaration for the columns to be returned by using a combination of column names, data types, and patterns.
TableName specifies the name of an existing table that can be given instead of SchemaDeclaration if a table with the desired schema already exists and no column patterns are required.
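To make the parameters concrete, the following sketch shows one way OPENXML might be called with a SchemaDeclaration. The sample document, the variable names, and the column names OrderID, CustomerID, and Amount are assumptions made for this example. It passes 3 for flags so that columns are matched against attributes first and then against child elements.

-- Minimal sketch: shred a small XML document into a rowset.
-- All names and data below are hypothetical.
DECLARE @hDoc int;
DECLARE @xml nvarchar(max);

SET @xml = N'<Orders>
  <Order OrderID="1" CustomerID="ALFKI"><Amount>25.50</Amount></Order>
  <Order OrderID="2" CustomerID="BONAP"><Amount>99.00</Amount></Order>
</Orders>';

-- Parse the XML text and get a handle to the internal tree.
EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml;

-- rowpattern '/Orders/Order' yields one row per Order element.
-- flags = 3: OrderID and CustomerID come from attributes,
-- Amount comes from the child element of the same name.
SELECT OrderID, CustomerID, Amount
FROM OPENXML(@hDoc, N'/Orders/Order', 3)
WITH (
    OrderID    int,
    CustomerID nvarchar(10),
    Amount     decimal(10, 2)
);

-- Release the parsed document.
EXEC sp_xml_removedocument @hDoc;

With the sample data, the query should return one row for each of the two Order elements. If a table with matching column names and types already existed, its name could be given in the WITH clause in place of the column list.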