PHP - PDF to HTML

Using PHP to convert PDF to HTML

Zamzar offers a simple file conversion API that lets you convert files from your own applications, with support for hundreds of formats. The example below converts a PDF file to HTML using PHP. We also support a variety of other programming languages.

If you have any questions, check out our comprehensive FAQ, which contains further information on how to use the API.

PHP Code Sample

Our example code assumes that you're using PHP 5.3 or newer.

To convert your first file with the Zamzar API, send an HTTP POST request to https://sandbox.zamzar.com/v1/jobs containing your source file and your desired target format. If the source file is on the web or in S3, send us the URL instead: the source file doesn't need to pass through your servers.

<?php

$endpoint = "https://sandbox.zamzar.com/v1/jobs";
$apiKey = "GiVUYsF4A8ssq93FR48H";
$sourceFile = "https://s3.amazonaws.com/zamzar-samples/sample.pdf";
$targetFormat = "HTML";

$postData = array(
  "source_file" => $sourceFile,
  "target_format" => $targetFormat
);

$ch = curl_init(); // Init curl
curl_setopt($ch, CURLOPT_URL, $endpoint); // API endpoint
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // Return response as a string
curl_setopt($ch, CURLOPT_USERPWD, $apiKey . ":"); // Set the API key as the basic auth username
$body = curl_exec($ch);
curl_close($ch);

$response = json_decode($body, true);

echo "Response:\n---------\n";
print_r($response);
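If the source file lives on your own machine rather than at a URL, the same request can carry the file itself. The following is a minimal sketch, not a definitive implementation: `buildLocalJobFields` is a hypothetical helper name, and it assumes PHP 5.5 or newer (for `CURLFile`) and that the endpoint accepts the file as a multipart `source_file` field.

```php
<?php

// Hypothetical helper: builds the POST fields for a job whose source is a
// local file rather than a URL. CURLFile requires PHP 5.5 or newer.
function buildLocalJobFields($localPath, $targetFormat)
{
    if (!is_readable($localPath)) {
        throw new InvalidArgumentException("Cannot read source file: $localPath");
    }

    return array(
        // Assumption: the API accepts a multipart upload in "source_file"
        "source_file" => new CURLFile($localPath),
        "target_format" => $targetFormat
    );
}

// Usage: pass the result to CURLOPT_POSTFIELDS in place of $postData.
// $postData = buildLocalJobFields("/tmp/sample.pdf", "html");
```

When cURL receives an array containing a `CURLFile`, it sends the request as multipart/form-data, so the rest of the request code can stay unchanged.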

Your source file is now being converted. Send an HTTP request to GET https://sandbox.zamzar.com/v1/jobs/$jobId to check its progress. The response will also give you details about your converted file.

<?php

$jobID = 15;
$endpoint = "https://sandbox.zamzar.com/v1/jobs/$jobID";
$apiKey = "GiVUYsF4A8ssq93FR48H";

$ch = curl_init(); // Init curl
curl_setopt($ch, CURLOPT_URL, $endpoint); // API endpoint
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // Return response as a string
curl_setopt($ch, CURLOPT_USERPWD, $apiKey . ":"); // Set the API key as the basic auth username
$body = curl_exec($ch);
curl_close($ch);

$job = json_decode($body, true);

echo "Job:\n----\n";
print_r($job);
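Because conversion is asynchronous, you will usually check the job more than once. The sketch below shows one way to poll until the job finishes. `waitForJob` and `firstTargetFileId` are hypothetical helper names. The `successful` status comes from the text in this guide; the `failed` and `cancelled` statuses and the `target_files` key are assumptions about the response format and should be checked against the API docs.

```php
<?php

// Hypothetical helper: polls a job until it reaches a final state.
// $fetchJob is any callable that returns the decoded job array, for example
// a function wrapping the GET request to /v1/jobs/$jobID.
function waitForJob($fetchJob, $maxAttempts = 30, $delaySeconds = 2)
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $job = call_user_func($fetchJob);
        $status = isset($job["status"]) ? $job["status"] : null;

        if ($status === "successful") {
            return $job;
        }
        // Assumption: "failed" and "cancelled" are terminal statuses.
        if ($status === "failed" || $status === "cancelled") {
            throw new RuntimeException("Job ended with status: $status");
        }
        if ($attempt < $maxAttempts) {
            sleep($delaySeconds); // wait before asking again
        }
    }
    throw new RuntimeException("Job did not finish after $maxAttempts attempts");
}

// Hypothetical helper: returns the ID of the first converted file, or null.
// Assumption: finished jobs list their output under a "target_files" key.
function firstTargetFileId($job)
{
    if (empty($job["target_files"][0]["id"])) {
        return null;
    }
    return $job["target_files"][0]["id"];
}
```

Passing the request in as a callable keeps the loop separate from the HTTP code, so the retry limit and delay can be tuned (or tested) without touching cURL.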

Once the status of your job is successful, your converted file is ready to download. Send an HTTP request to GET https://sandbox.zamzar.com/v1/files/$fileId/content to download it. We store your files for a day by default, and for longer on our paid plans.

<?php

$fileID = 3;
$localFilename = "converted.html";
$endpoint = "https://sandbox.zamzar.com/v1/files/$fileID/content";
$apiKey = "GiVUYsF4A8ssq93FR48H";

$ch = curl_init(); // Init curl
curl_setopt($ch, CURLOPT_URL, $endpoint); // API endpoint
curl_setopt($ch, CURLOPT_USERPWD, $apiKey . ":"); // Set the API key as the basic auth username
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);

$fh = fopen($localFilename, "wb");
curl_setopt($ch, CURLOPT_FILE, $fh);

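$body = curl_exec($ch); // Writes the response to $fh; returns true on success
curl_close($ch);
fclose($fh); // Close the local file so its contents are flushed to disk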

echo "File downloaded\n";

If you like what you see and want to start converting files under your own API account, click the "Get Started Now" button to sign up. Feel free to get in touch with us if you have any specific questions, or refer to our extensive docs and FAQ for further information.

Get Started Now

Why use Zamzar?

  • Hosting

    Don't worry about hosting and running your own servers: we do this for you.

  • S3 Integration

    Automatically import and export to S3 with 2 lines of code.

  • Simple Pricing

    Fixed price monthly accounts which come bundled with conversion credits.

  • Great Support

    Our support team is staffed by software developers who will help to fix your problem.

  • Conversion Experts

    File conversion experts, having converted 350 million files over the past decade.

Using PDF with PHP

There are many open source PDF manipulation libraries for PHP to choose from. FPDF is a library that allows you to generate PDF files in pure PHP, with no need for the PDFlib library. According to the license agreement referenced in its FAQ, you can use FPDF with no restrictions in both commercial and non-commercial environments.

Another open source PHP option is mPDF, which generates PDF files from UTF-8 encoded HTML. It is based on FPDF and HTML2FPDF and was designed to create PDF files quickly from a website template. It supports right-to-left languages as well as a host of other features.

TCPDF is a great PHP PDF library for creating PDF files from as little as two lines of code. It doesn't require any external libraries for the basic creation of PDF files and is arguably the most extensive PHP PDF library that we have come across. It is actively maintained by Nicola Asuni on GitHub.

If you don't want to use an open source offering, you might want to consider the Zamzar API. With a dedicated support team, code examples in many of the major languages including PHP, simple low-cost conversion credits and support for direct import and export to S3, it may cover most of the use cases you require. Feel free to reach out to our support team with any questions or dip into the getting started guide in our docs.


Using HTML with PHP

There are a number of open source tools for creating HTML files from a variety of source file formats. The generated HTML can often be very heavyweight or difficult to edit, so it is worth reviewing the different programs on offer to see which suits your needs. If you have a fairly simple source file, such as a Microsoft Word document that is not too complex, you could consider a tool such as Unoconv, which converts between the formats that OpenOffice supports and can output to HTML. The results can be a little mixed, however, with more complex documents.

Another option is Poppler, also an open source tool. A number of developers have created PHP libraries that use Poppler to convert PDF into HTML, so one approach is to convert the file into PDF first and then use a Poppler PHP library to convert the resulting PDF into HTML - see PDF to HTML PHP Library using Poppler. Poppler is known to have some rendering issues, so it is certainly worth testing on a wide variety of files before deciding on this as a solution.
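As a rough illustration of the Poppler route, the sketch below builds a command line for Poppler's `pdftohtml` utility instead of using one of the PHP wrapper libraries. `buildPdfToHtmlCommand` is a hypothetical helper name, and the sketch assumes that `pdftohtml` is installed and available on the PATH.

```php
<?php

// Hypothetical helper: builds a shell command for Poppler's pdftohtml tool.
// Assumes pdftohtml is installed and on the PATH.
function buildPdfToHtmlCommand($pdfPath, $outputPath)
{
    // -noframes asks for a single HTML file instead of a frameset.
    // escapeshellarg() quotes each path so spaces or shell metacharacters
    // in a filename cannot alter the command.
    return sprintf(
        "pdftohtml -noframes %s %s",
        escapeshellarg($pdfPath),
        escapeshellarg($outputPath)
    );
}

// Usage:
// exec(buildPdfToHtmlCommand("sample.pdf", "sample.html"), $output, $exitCode);
// if ($exitCode !== 0) { /* conversion failed; inspect $output */ }
```

Checking the exit code matters here because rendering problems with some PDFs are one of the known weak points of this approach.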

If you need a commercial solution and the support that comes with it, you could consider the Zamzar API. With a dedicated team on hand to help, code examples in many of the major languages including PHP, simple low-cost conversion credits and support for direct import and export to S3, it may cover most of the use cases you require. Feel free to reach out to our support team with any questions or dip into the getting started guide in our docs, which also list the supported input formats.
