Insights on Software Development and Architecture

ErionPC's weblog on software development


A SOA approach to automated DOCX-PDF report generation – part 1


With the advent of the MS Office 2007 Open XML formats, the philosophy of Office report generation changed deeply: it became detached from Office itself and open to any programming language capable of reading compressed archives and manipulating XML.

In this article I’m going to illustrate a SOA approach for generating Docx reports in a distributed environment where MS Office 2007 needs to be installed only on the developer machine (not on the production server). The application is composed of the following parts:

  1. An ASP.NET web application
  2. An IIS-hosted WCF service
  3. A business tier
  4. A data access tier
  5. A database

The scope of this article is limited to the top two tiers (the web application and the WCF service). By using the Open XML SDK (version 2.0 at the time of writing) it’s possible to programmatically read and write inside Office Open XML packages, which means reading and writing Office files without using Office COM objects. Because no Word process has to be started on the server, this approach is fast, easy, light on resources and, above all, stable.

The WCF service in this application must be able to create Docx reports on the basis of an existing docx template and some db data serialized as XML.

Docx files are constructed in a modular way. To appreciate this, you can just rename a docx file, changing its extension to .zip, and open the archive. The part that we’re interested in is called Custom XML. The best approach for manipulating data within a Docx file is to bind content controls to custom XML parts.
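To look at this modular structure without renaming anything, the package can also be listed from code. The following is only a minimal sketch, assuming .NET 4.5 or later for System.IO.Compression; the DocxInspector class and its ListEntries method are names made up for this illustration and are not part of the Open XML SDK.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, for illustration only.
public static class DocxInspector
{
    // Returns the names of all entries (parts) stored in a .docx package.
    public static List<string> ListEntries(Stream docxStream)
    {
        var names = new List<string>();

        // leaveOpen: true, so the caller keeps ownership of the stream
        using (var archive = new ZipArchive(docxStream, ZipArchiveMode.Read, true))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                names.Add(entry.FullName);
            }
        }

        return names;
    }
}
```

Called with a FileStream opened on a document saved by Word, the list typically includes [Content_Types].xml and word/document.xml and, once custom XML has been added, entries under customXml/.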

1. Generating a docx template document

The first thing to do is to build, with Word 2007 or later, a docx document which defines the layout of the reports. This document is going to contain static parts (text blocks, images and so on) and dynamic parts, which depend on the data. At first we build and format the docx file as we expect it to look once the dynamic data is in place, using fake data as a placeholder. Then, when we’re happy with the way it looks, it’s time to add the content controls. On the Word ribbon we need to go to the Developer tab (it is hidden by default and has to be enabled in the Word options). In this tab we can find a few content controls, such as rich text, plain text, image, etc. We now need to replace the fake data that we’ve put into the document with the appropriate content controls.

2. Creating Custom XML parts

Using Word 2007 we’re able to put content controls into a docx document, but we’re not able to bind those controls to custom data. In order to do this we need another tool, the Word 2007 Content Control Toolkit (WCCT). At this point our docx document doesn’t contain any custom XML parts. We can create them with WCCT. Open the docx document inside WCCT. On the right panel click on “Create a new Custom XML part”. The custom XML part will be created and we’ll be able to see it in the “Bind view” tab. On the left part of the window we will see references to the content controls that we’ve inserted in the file. By clicking on the “Edit view” tab of the right panel it’s possible to edit the XML. The XML structure that we create has to be valid and needs to correspond to the content controls in the page. For example:

<title alias="Title">document title</title>
<body alias="Body">document body</body>

When we’ve finished creating the XML, it’s always good to get its syntax checked by WCCT by clicking on the “Check Syntax” button. We’re now ready to go back to the “Bind View”. We will now see the XML nodes we’ve just inserted in a tree-like structure, and the fun part is about to begin. We’ll now bind the XML nodes to the content controls, and this is as easy as drag-and-drop. Select one of the nodes on the right panel and drag it onto the reference to one of the content controls of the document. Repeat this operation for all of the XML nodes until all the content controls have been bound to data. When you’re done, save the file and click on the preview button to open the document in Word. Notice how the custom XML data has replaced the text inside the content controls.
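Behind the scenes, each binding is stored in word/document.xml as a w:dataBinding element inside the content control’s properties (w:sdtPr), with a w:xpath attribute that points into the custom XML part. As a rough sketch of how to check which controls ended up bound, the hypothetical helper below (BindingReader and ReadBindings are illustrative names, not part of WCCT or the Open XML SDK) reads the alias and XPath of every bound control from the text of document.xml using LINQ to XML:

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class BindingReader
{
    // WordprocessingML main namespace (the "w:" prefix in document.xml)
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Maps each content control's alias to the XPath it is bound to.
    // Controls without an alias or without a binding are skipped.
    public static Dictionary<string, string> ReadBindings(string documentXml)
    {
        var result = new Dictionary<string, string>();
        XDocument doc = XDocument.Parse(documentXml);

        foreach (XElement sdtPr in doc.Descendants(W + "sdtPr"))
        {
            XElement alias = sdtPr.Element(W + "alias");
            XElement binding = sdtPr.Element(W + "dataBinding");
            if (alias == null || binding == null) continue;

            string name = (string)alias.Attribute(W + "val");
            string xpath = (string)binding.Attribute(W + "xpath");
            if (name != null && xpath != null)
            {
                result[name] = xpath;
            }
        }

        return result;
    }
}
```

A control that is missing from the returned dictionary has not been bound yet, which is an easy thing to overlook when a template contains many controls.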

3. Building the WCF service

The WCF service will replace the custom XML inside the docx template with the XML data coming from the business logic. Using the Open XML SDK this is actually very easy. Here’s the replaceCustomXML method:

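// Requires: using System; using System.IO; using System.ServiceModel; using DocumentFormat.OpenXml.Packaging;
private void replaceCustomXML(string docxTemplate, string customXML)
{
    try
    {
        using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(docxTemplate, true))
        {
            MainDocumentPart mainPart = wordDoc.MainDocumentPart;
            // Assumption: the existing custom XML parts are deleted first, so that the new part replaces them
            mainPart.DeleteParts<CustomXmlPart>(mainPart.CustomXmlParts);

            //Add a new customXML part and then add content
            CustomXmlPart customXmlPart = mainPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);

            //copy the XML into the new part…
            using (StreamWriter ts = new StreamWriter(customXmlPart.GetStream()))
                ts.Write(customXML);
        }
    }
    catch (Exception ex)
    {
        throw new FaultException("WCF error!\r\n" + ex.Message);
    }
}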

Once the docx document is created it will be sent to the client as an array of bytes.

// Requires: using System; using System.IO; using System.ServiceModel; using System.Web;
public byte[] GenerateDynamicDocx(string customXML)
{
    try
    {
        // Note: HttpContext.Current is null unless ASP.NET compatibility mode is enabled for the service
        HttpServerUtility webServer = HttpContext.Current.Server;

        // Copy template.docx in the temp folder to preserve the original copy
        string tempFolder = webServer.MapPath("temp");
        string tempDocxFileName = Guid.NewGuid() + ".docx";
        string tempDocxFilePath = Path.Combine(tempFolder, tempDocxFileName);
        File.Copy(webServer.MapPath(@"App_Data/template.docx"), tempDocxFilePath);

        replaceCustomXML(tempDocxFilePath, customXML);

        byte[] docxContents = File.ReadAllBytes(tempDocxFilePath);

        //Delete the temporary file
        File.Delete(tempDocxFilePath);

        return docxContents;
    }
    catch (Exception ex)
    {
        throw new FaultException("WCF error!\r\n" + ex.Message);
    }
}

4. Building the ASP.NET client

The ASP.NET client will have a template.xml file which replicates the structure of the custom XML part in the server’s docx template. Ideally, there would be a web page which automatically generates the web controls for entering data by mirroring the structure of the XML template file. After the data has been entered, the web client must compose an XML document which follows the structure of the existing template.xml but replaces its data with the values entered by the user. The XML string is then sent to the WCF service, which returns the bytes of the docx file. These bytes can then either be saved as a docx file on the server or sent directly to the client through HTTP.
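The composition step can be sketched with LINQ to XML. This is only an illustration under a few assumptions: the ReportXmlBuilder class and FillTemplate method are hypothetical names, the user input is assumed to arrive as a dictionary keyed by element name, and the template is assumed to have a single root element (called report in the usage below) wrapping the title and body nodes.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class ReportXmlBuilder
{
    // Returns a copy of the template XML in which every leaf element whose
    // name appears in 'values' has its text replaced by the user's input.
    // Elements that are not present in 'values' keep their template text.
    public static string FillTemplate(string templateXml, IDictionary<string, string> values)
    {
        XDocument doc = XDocument.Parse(templateXml);

        // ToList() takes a snapshot, so the tree is not modified while it is being enumerated
        foreach (XElement element in doc.Descendants().ToList())
        {
            string replacement;
            if (!element.HasElements &&
                values.TryGetValue(element.Name.LocalName, out replacement))
            {
                // Assigning Value lets LINQ to XML escape characters such as & and <
                element.Value = replacement;
            }
        }

        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

For a template such as <report><title alias="Title">document title</title><body alias="Body">document body</body></report>, passing a dictionary with the keys title and body yields the string that would then be handed to GenerateDynamicDocx. If the returned bytes are streamed straight to the browser, the content type for docx files is application/vnd.openxmlformats-officedocument.wordprocessingml.document.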
