Several standard structured data formats exist, and there is no shortage of debate over which is best. Xplenty lets users process both JSON and XML with ease. This article walks through an example that shows the functions used to process XML on Xplenty.

Integrate Your Data Today!

Try Xplenty free for 14 days. No credit card required.

Table of Contents:

  1. Overview and Resources
  2. Setting Up the Xplenty Data Pipeline
  3. Summary

Overview and Resources

For this demonstration, we will process the sample XML file available at https://docs.microsoft.com/en-us/previous-versions/windows/desktop/ms762271(v=vs.85)

The file has the XML structure shown in the image below:

XML-source.png

The Xplenty functions XPath and XPathToBag are key to the processing of this data. Let's examine these with a data pipeline.

Setting Up the Xplenty Data Pipeline

XML-processing-pipeline.png

The following list explains the components of the Xplenty pipeline in order:

1. XML_Source: The XML file from the link above is copied to a cloud storage location and read using the File Storage Source component.

2. XPathToBag: This step calls the XPathToBag function with the XPath '/catalog/book', which fetches all the books under <catalog> </catalog> as a Bag datatype. For example: XPathToBag(data,'/catalog/book')

3. Flatten_Books: Uses the Flatten() function to turn the bag of books into individual records, each with the following structure:

            Part-of-XML-source.png

4. XPath: This step uses the XPath function to retrieve the individual elements of the book structure. Here is a peek into the component with the XPath expressions set up for the above <book> </book> structure:

        XML-processing-XPath.png

To test XPath expressions and see further examples, refer to an XPath evaluator such as freeformatter.com.

5. Destination: The individual fields extracted from the XML are stored in a destination, which in this example is a BigQuery table.

The following image depicts some example records from the output:

XML-processing-destination.png
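
To see the logic of steps 2 through 4 outside the Xplenty interface, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It is not Xplenty's implementation. The helper names xpath_to_bag and xpath are hypothetical stand-ins for XPathToBag and XPath, the inline XML is an abbreviated copy modeled on the linked sample file, and the field list is an assumption based on that file. Paths are written relative to the root element because ElementTree does not accept absolute paths such as '/catalog/book' on an element.

```python
import xml.etree.ElementTree as ET

# Abbreviated stand-in for the sample file (assumed layout: <catalog> with <book> children).
SAMPLE_XML = """<catalog>
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
  </book>
  <book id="bk102">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
    <genre>Fantasy</genre>
    <price>5.95</price>
    <publish_date>2000-12-16</publish_date>
  </book>
</catalog>"""

# Child elements to pull out of each <book>; assumed from the sample file.
BOOK_FIELDS = ["author", "title", "genre", "price", "publish_date"]


def xpath_to_bag(xml_text, relative_path):
    """Hypothetical analogue of XPathToBag: every matching node, as an XML string."""
    root = ET.fromstring(xml_text)
    return [ET.tostring(node, encoding="unicode") for node in root.findall(relative_path)]


def xpath(xml_text, relative_path):
    """Hypothetical analogue of XPath: the first match's text, or an attribute via '@name'."""
    root = ET.fromstring(xml_text)
    if relative_path.startswith("@"):
        return root.get(relative_path[1:])
    return root.findtext(relative_path)


def parse_books(xml_text):
    """Steps 2-4: collect the books, iterate them one by one, extract the fields."""
    records = []
    # Iterating over the list plays the role of Flatten(): one record per book.
    for book_xml in xpath_to_bag(xml_text, "book"):
        record = {"id": xpath(book_xml, "@id")}
        for field in BOOK_FIELDS:
            record[field] = xpath(book_xml, field)
        records.append(record)
    return records


if __name__ == "__main__":
    for row in parse_books(SAMPLE_XML):
        print(row)
```

Every extracted value is a string here. Casting fields such as price or publish_date to numeric or date types would be a separate step before loading into a destination table.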

Parsing XML from a file or an API response into a tabular structure is key to enabling data lookups, and blending the result with other datasets can facilitate further analysis.
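
As a rough illustration of blending, the Python sketch below joins book rows that are already in tabular form with a second dataset keyed on book id. The book values follow the sample file. The units_sold lookup, the blend function, and the revenue column are invented purely for this example and are not part of the Xplenty pipeline.

```python
# Rows as they might look once the XML has been flattened into a table.
books = [
    {"id": "bk101", "title": "XML Developer's Guide", "genre": "Computer", "price": "44.95"},
    {"id": "bk102", "title": "Midnight Rain", "genre": "Fantasy", "price": "5.95"},
]

# Hypothetical second dataset, invented for this example: units sold per book id.
units_sold = {"bk101": 3, "bk102": 10}


def blend(book_rows, sales_by_id):
    """Left-join book rows with the lookup table and add a revenue column."""
    blended = []
    for row in book_rows:
        units = sales_by_id.get(row["id"], 0)  # books without sales get 0 units
        revenue = round(float(row["price"]) * units, 2)
        blended.append({**row, "units_sold": units, "revenue": revenue})
    return blended


if __name__ == "__main__":
    for row in blend(books, units_sold):
        print(row["id"], row["title"], row["units_sold"], row["revenue"])
```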

Summary

Many enterprise systems consume and output XML data, and because XML is a trusted format for document-based information transfer, XML-based files and APIs come up often as use cases. Stop by and explore Xplenty's functionality for processing structured data formats. For more individualized instruction and information, contact us to book a risk-free demo.