A SOA approach to dynamic DOCX-PDF report generation – Part 2

Introduction

Having already achieved automated, MS Office-independent docx report generation in a client-server architecture, following the approach explained in my previous article “A SOA approach to dynamic DOCX-PDF report generation – part 1”, we’ll now look at automatically printing those docx files to PDF from managed code and transmitting the PDF bytes over HTTP.

The PDF conversion relies on the free Bullzip PDF Printer, a full-featured, programmable and very well documented PDF printer that can print to PDF any file that an installed application knows how to print, including docx files.

PDF is probably the most widely used document exchange format across platforms, so most data-centric applications need to produce PDF reports of some kind.

1. Installing the PDF Printer

The first thing to do is to download and install the Bullzip PDF Printer. The installer creates a PDF printer on the system and places a help file in the installation directory. Read through the help file to learn how to use the Bullzip.PdfWriter namespace.

2. Adding the PDF Conversion to an Existing Visual Studio Solution

First of all, we need to reference the Bullzip assembly in the solution. Conveniently, the assembly is registered in the GAC, so just go to Add Reference -> .NET and select BullZip Pdf Writer. This adds the Bullzip.PdfWriter assembly to the solution, which exposes its classes and methods under the Bullzip.PdfWriter namespace.

The next thing to do is to configure the PDF printer. This can be done through an .ini file, but I’m not going to go into the details here; the Bullzip documentation covers it extensively. The printer settings are managed by a class called PdfSettings, whilst the PDF creation methods are in a class called PdfUtil. Everything is ready now, so we can start converting to PDF!
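For orientation only, a settings file for the configuration step described above might look like the sketch below. This is not the file shipped with the test application: the path is made up, and the setting names are given from memory of the Bullzip documentation, so check each one against the help file before relying on it.

```ini
; Hypothetical printerSettings.ini (a sketch, not the author's file).
; Verify every setting name and value against the Bullzip help file.
[PDF Printer]
; Output file. C:\MyApp\temp is a made-up folder; the PrintToPdf code in
; section 3 overrides this value at run time anyway.
Output=C:\MyApp\temp\<docname>.pdf
; Suppress the printer's own dialogues so that the job can run unattended.
ShowSettings=never
ShowSaveAS=never
ShowProgress=no
ShowProgressFinished=no
ShowPDF=no
ConfirmOverwrite=no
```

These settings only concern the dialogues of the PDF printer itself; they do not control any dialogue shown by the application that prints the file.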

3. Converting to PDF

Here’s what the test application does:

Includes some docx templates with sample data in a templates directory
Generates customized docx reports based on the docx templates and some XML-serialized Business-Logic data whose structure corresponds to the custom XML parts in the docx templates
Saves the docx reports into a temporary directory
Prints the docx reports to PDF
Sends the PDF bytes over HTTP
Deletes the docx and PDF files

The PrintToPdf method below loads the printer settings from an “.ini” file, sends a docx file from the temporary directory to the PDF printer, waits for the PDF file to appear, reads its bytes, and then deletes both the docx and the PDF file.

using System;
using System.IO;
using System.Configuration;
using System.ServiceModel;
using Bullzip.PdfWriter;

namespace DocxGenerator.SL.WCF
{
    public class PdfMaker
    {
        // Prints a docx file from the application's temp folder to PDF and
        // returns the PDF bytes. Both temporary files are deleted afterwards.
        internal static byte[] PrintToPdf(string appFolder, string tempDocxFileName)
        {
            try
            {
                string tempFolder = appFolder + @"\temp";
                string tempDocxFilePath = tempFolder + @"\" + tempDocxFileName;

                PdfSettings pdfSettings = new PdfSettings();
                pdfSettings.PrinterName = ConfigurationManager.AppSettings["PdfPrinter"];

                // Load the base settings, point the output at the temp folder
                // and write the result to the printer's settings file.
                string settingsFile = pdfSettings.GetSettingsFilePath(PdfSettingsFileType.Settings);
                pdfSettings.LoadSettings(appFolder + @"\App_Data\printerSettings.ini");
                // <docname> is a Bullzip macro that is replaced by the print job name.
                pdfSettings.SetValue("Output", tempFolder + @"\<docname>.pdf");
                pdfSettings.WriteSettings(settingsFile);

                PdfUtil.PrintFile(tempDocxFilePath, pdfSettings.PrinterName);
                // The expected file name assumes that Word titles the print job
                // "Microsoft Word - " followed by the docx file name.
                string tempPdfFilePath = tempFolder + @"\Microsoft Word - " + tempDocxFileName + ".pdf";

                // Poll in one-second steps until the PDF exists. Note that this
                // loop never exits if the PDF is never created.
                bool fileCreated = false;
                while (!fileCreated)
                {
                    fileCreated = PdfUtil.WaitForFile(tempPdfFilePath, 1000);
                }

                byte[] pdfBytes = File.ReadAllBytes(tempPdfFilePath);

                // Clean up the temporary files.
                File.Delete(tempDocxFilePath);
                File.Delete(tempPdfFilePath);

                return pdfBytes;
            }
            catch (Exception ex)
            {
                // Surface any failure to the WCF client as a fault.
                throw new FaultException("WCF ERROR!\r\n" + ex.Message);
            }
        }
    }
}
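
The wait loop in PrintToPdf keeps polling for as long as the PDF file does not exist, so it never returns if the print job fails or is blocked. A minimal sketch of a bounded alternative follows. BoundedFileWait and WaitUntilReadable are hypothetical names introduced here for illustration (they are not part of Bullzip), and the sketch uses only the .NET base class library rather than PdfUtil.WaitForFile.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading;

// Hypothetical helper, not part of Bullzip or of the test application.
public static class BoundedFileWait
{
    // Polls until the file can be opened exclusively or the overall
    // deadline expires. Returns true if the file became readable in time.
    public static bool WaitUntilReadable(string path, TimeSpan deadline, TimeSpan pollInterval)
    {
        Stopwatch elapsed = Stopwatch.StartNew();
        while (true)
        {
            if (IsReadable(path))
            {
                return true;
            }
            if (elapsed.Elapsed >= deadline)
            {
                return false;
            }
            Thread.Sleep(pollInterval);
        }
    }

    // Opening with FileShare.None fails while another process still has
    // the file open, which helps to avoid reading a half-written PDF.
    private static bool IsReadable(string path)
    {
        try
        {
            using (FileStream stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None))
            {
                return true;
            }
        }
        catch (IOException)
        {
            // The file is missing or still locked by the writer.
            return false;
        }
    }
}
```

In PrintToPdf, the while loop could then be replaced by a single call such as BoundedFileWait.WaitUntilReadable(tempPdfFilePath, TimeSpan.FromSeconds(60), TimeSpan.FromMilliseconds(500)), throwing an exception when it returns false. The 60-second limit is an arbitrary assumption; choose a value that suits the size of your reports.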

Points of Interest

This article only illustrates what can be achieved with this architecture. With a little bit of head-scratching, you can extend it into a PDF conversion server (a free alternative to Adobe Distiller, anyone?), a scheduled batch printer, an archiving system, etc.
If integrated into the SOA report generation solution mentioned above, this lets you discard the docx files and use PDF as the document exchange format.

Have fun!

History

The previous article, which led to this one and is a must-read for understanding the SOA integration concepts: “A SOA approach to dynamic DOCX-PDF report generation – part 1”

Click here to view this article on CodeProject.

Click here to download the test application’s source code.

9 comments

Mandar says:

7 September 2011 at 21:0

Regarding your topic on docx to PDF conversion: I am trying to convert HTML to PDF.
Everything works fine, except that when the “PdfUtil.PrintFile” line gets executed, a print dialogue box opens. Is there any way to make this window invisible?

Thanks in advance

- erionpc says:
  
  7 September 2011 at 23:0
  
  The answer can be found in the printer’s configuration file (.ini). My example doesn’t show the print dialogue box and it’s not interactive – it just opens the docx document and prints it without the need of human interaction. Can you be a bit more specific regarding your issue?
  
mandar kulkarni says:

8 September 2011 at 14:0

Thanks for your reply. The code works fine for .doc, .xls and .txt files. I have an .htm or .html file that I would like to convert to PDF using the above code. The .htm or .html file gets opened in the browser, but the opened file is neither closed nor converted into PDF.
Please help.

Thanks in advance.

erionpc says:

8 September 2011 at 14:0

Can you give a code sample? What browser are you using? What happens if you put breakpoints and debug it? Does it throw any exceptions?

- mandar kulkarni says:
  
  8 September 2011 at 17:0
  
  The entire code is copied exactly as given by you; I just give an .html file as the input file instead of a doc file. I have tried this on both IE 6 and IE 8. The html file gets opened in the IE browser, and the code goes into an infinite loop at the following point:
  while (!fileCreated)
  {
  fileCreated = PdfUtil.WaitForFile(tempPdfFilePath, 1000);
  }
  
  - mandar kulkarni says:
    
    8 September 2011 at 18:0
    
    It doesn’t throw any exception.

Mandar Kulkarni says:

9 September 2011 at 3:0

I am using exactly the same code as given by you. The only difference is that instead of a .doc file I am using an .html file. I have tried this on both IE 6 and IE 8. If I try to debug, the code goes into an infinite loop at the following point:
while (!fileCreated)
{
fileCreated = PdfUtil.WaitForFile(tempPdfFilePath, 1000);
}
It doesn’t throw any exception, but everything works fine for .doc, .xls and .txt files.

- erionpc says:
  
  9 September 2011 at 15:0
  
  You got me curious :). I tested it myself and I confirm what you say. The funny thing is unless IE is set as system default browser, a Win32Exception: “No application is associated with the specified file for this operation” is thrown by the application (I’m developing in VS2008 in Win7 64b). If IE is set as default browser then the print dialogue gets opened, despite being disabled from the settings file. If you just press OK on it the print job is started and it works. It’s annoying that you need to intervene manually. There’s got to be a solution for this, but I haven’t got time to look for it, sorry. Try searching on “http://www.biopdf.com/guide/index.php”.
  
  Good luck.
  
Mandar Kulkarni says:

15 September 2011 at 11:0

I tried searching on the link given by you, but no luck. All the settings given there suppress the dialogue boxes of Bullzip itself, not the print dialogue box. I am still looking for a solution. Please let me know if you find one despite your busy schedule.

thanks in advance

Insights on Software Development and Architecture

ErionPC's weblog on software development
