American Printer's mission is to be the most reliable and authoritative source of information on integrating tomorrow's technology with today's management.

The best PDFs

Aug 1, 2006 12:00 AM



What makes a PDF file print-perfect? Ideally, the application used to create the original layout should be true design software, such as InDesign or QuarkXPress. All too often, content creators use a product like Microsoft Word as a layout application. This is like baking a “mock” apple pie with Ritz Crackers and expecting it to taste exactly like a pie made with real apples. Word is not a graphic design application; it is a word processor! As anyone who has submitted (or received) a Word document for print output knows, the text in the document has a good chance of reflowing once the document is opened.

Microsoft Word obtains font metrics from the operating system based on the resolution and characteristics of the target output device, not from the device- and resolution-independent font metrics. So when you create a Word document with your desktop printer set up as the default, the text in the file is set up to output at that printer's resolution, usually 600 dpi. If you change the target output device to another printer with a different resolution (like your 2,400-dpi CTP engine), or even to the Adobe PDF/Distiller virtual printer, the text flow is very likely to change as well.
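To see why device-dependent metrics can move line breaks, consider a deliberately simplified model in which each glyph's advance width is rounded to a whole number of device pixels. This is an illustrative assumption, not Word's actual layout algorithm, and the 278-unit advance width (on a 1,000-unit em) is a hypothetical value.

```python
POINTS_PER_INCH = 72

def line_width_pt(advances_units, size_pt, dpi, units_per_em=1000):
    """Width of a run of glyphs, in points, when each advance is
    rounded to whole device pixels (a simplifying assumption)."""
    total_px = 0
    for units in advances_units:
        exact_px = units * size_pt / units_per_em * dpi / POINTS_PER_INCH
        total_px += round(exact_px)
    return total_px * POINTS_PER_INCH / dpi

# 100 hypothetical glyphs, each 278 units wide, set at 10 pt
run = [278] * 100
desktop = line_width_pt(run, 10, 600)       # 600-dpi desktop printer
platesetter = line_width_pt(run, 10, 2400)  # 2,400-dpi CTP engine
print(desktop, platesetter)
```

In this toy model the same run of text measures 276 pt at 600 dpi and 279 pt at 2,400 dpi, a difference large enough to push a word onto the next line.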

Content creators can avoid this problem simply by setting Distiller/Adobe PDF as the default printer on their computers. Every subsequent new Word document will be set up to Distiller's metrics. When a PDF file is created, the text will not change. “Won't the text reflow when I try to print the file to my printer?” you might wonder. Not if you open the PDF file in Adobe Acrobat or Reader and print to the target printer from there! When it ships, Word 12 will allow users to directly export PDF files without going through a printer driver — a feature that should eventually alleviate some current headaches.

What every PDF should have
Simply using the proper design-oriented software doesn't guarantee print-perfect PDF files. Bad PDF files have been created from every application that can write PostScript or export PDF.

Although there is no such thing as a typical print project, there are characteristics common to a print-viable PDF file. These include:

  • All fonts used are embedded (fully or subset to include only glyphs used).
  • All included bitmap images are of sufficient resolution for the final print method.
  • If compression is used for images, it is lossless (zip) or highest-quality JPEG.
  • Illustrations are encoded as vector data: no erroneous conversion to bitmaps.
  • Colors are specified in the correct color space (as intended to print).
  • Physical dimensions of page size are correct and sufficient to include bleed objects.
  • There is a plan for how, when and where to flatten live transparent objects.

Don't forget the fonts!
Because a font must reside on the computer from which the file is being printed, most output providers ask content creators to send fonts along with a job for print output. While font foundries have different rules for the exchange of their products (based on the license agreement that we usually glibly “agree” to so we can install the font), the typical agreement allows fonts to be sent along with a job for print output, but the output provider also must have a license to use that font. This can be a problem if the designer uses an obscure font. In PDF this shouldn't be an issue, because most foundries, Adobe included, allow font embedding for the purpose of print and preview, and the printer does not have to own a license for the font to print the job.

So why are there still so many font-related problems? While most default PDF creation methods are set to embed fonts, it still is easy to inadvertently omit a font from the final PDF file. Consider Distiller, Adobe's flagship PDF creation tool. Distiller ships with several canned “PDF settings” (a.k.a. job options), but it is easy to create or edit new settings. At the application level, fonts can be included in the PostScript that is sent to Distiller or not, depending on the user-selected print settings. On the Windows platform, when a PDF file is created through an Adobe PostScript printer driver, TrueType fonts can be handled in one of five different ways, depending on the option selected in the “send fonts as” or “TrueType download” dialog (where this dialog can be found varies with the version of Windows). A TrueType font can be converted to the Type 1 equivalent, sent as Type 42 (native TrueType), converted to outlines, converted to bitmaps, or not included at all. If Distiller can't access the fonts through the PostScript file, it will look at the available fonts on the system. If it cannot find them there, and the font settings are not set to cancel the job when fonts cannot be embedded, the PDF file will be created without them. Font embedding is user selectable from the print or PDF export settings in QuarkXPress, but InDesign and Illustrator force inclusion of fonts for PDF export.
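The lookup order just described can be written out as a small decision function. This is only a model of the behavior as the text describes it; the function name, arguments and return strings are hypothetical and do not correspond to any Distiller interface.

```python
def resolve_font(name, postscript_fonts, system_fonts, cancel_if_missing):
    """Return where a font would come from, in the order described:
    the PostScript stream first, then the fonts installed on the system."""
    if name in postscript_fonts:
        return "embedded from PostScript"
    if name in system_fonts:
        return "embedded from system"
    if cancel_if_missing:
        # Models the job option that cancels the job on an embedding failure
        raise RuntimeError(f"job cancelled: cannot embed {name}")
    return "not embedded"

# Hypothetical font names, for illustration only
print(resolve_font("Minion", {"Minion"}, {"Arial"}, True))
print(resolve_font("Arial", {"Minion"}, {"Arial"}, True))
print(resolve_font("Obscure", {"Minion"}, {"Arial"}, False))
```

The last call is the dangerous case: nothing fails, and the PDF file is quietly written without the font.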

Even if the PDF creation tool is set to embed them, OpenType and TrueType fonts can be restricted to disallow embedding of any kind. While very few modern font foundries would offer such fonts, fonts acquired in the late 1990s, such as those packaged with CorelDraw, might include this restriction. Some applications will not allow the user to place restricted fonts in a layout (the Mac version of QuarkXPress 6.5 and 7, for example). Others (such as InDesign CS) will warn that the font cannot be embedded in a PDF file. Still others will offer no warning at all that a restricted font is being used.

Using the “Document Properties” option available in Adobe Acrobat or Reader lets users confirm a font is embedded in a PDF file. If a font is listed as “embedded,” all of the glyphs that are part of that particular font are included in the PDF file. If it is listed as “embedded subset,” only the glyphs used in the document are included. If fonts are not embedded, there are two options for font preview and print. Acrobat or Reader can use the Adobe Multiple Master font that installs with the application to simulate the missing font for preview and print. This substituted font seldom matches the original, but at least it allows the viewer to read the text. If the called-for font is available on the local computer, it will be used for preview and print from Acrobat products. This doesn't help if the PDF file is not being printed directly from Acrobat, however. In this case, the local font can be embedded into the PDF file using the touch-up text tool or with a third-party editing tool like Enfocus PitStop Professional.

Glyph glitches to avoid
The only thing worse than a missing font is a missing or substituted individual character. Have you ever seen a character replaced by a rectangular box or an entirely different glyph than the one called for in a PDF file, even when the font is embedded? Even worse, the glyph might be correct when the file is viewed on screen, but substituted with another glyph in the output file. How does this happen?

Typically this results from differences in font types and font encoding, combined with differences in computer platforms. PDF files can contain the three most common types of font formats: PostScript Type 1, TrueType and OpenType. When it comes to Latin-based languages (e.g., English) there are fewer than 128 commonly used characters, which neatly fit into seven of the eight bits available in an ASCII format file. Most operating systems use the same character set for these 128 “low ASCII” characters, so any common character (a lower case “a,” for example) will be the same no matter which application or OS writes out the file. The remaining 128 characters (that eighth bit), which generally are less common characters, are not necessarily encoded the same way on the Mac and Windows platforms, and might be different depending upon the application writing or printing the file. This is the cause of much substitution of nonstandard characters like ligatures and fractions.
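The split between "low ASCII" and the upper 128 characters is easy to check with Python's standard codecs. The sketch below uses the Mac OS Roman and Windows-1252 encodings as representative examples of the two platforms' eight-bit character sets; individual fonts and applications may use other encodings.

```python
def decode_both(byte_value):
    """Decode one byte as Mac OS Roman and as Windows-1252."""
    raw = bytes([byte_value])
    mac = raw.decode("mac_roman")
    # A few Windows-1252 positions are undefined; mark them with U+FFFD.
    win = raw.decode("cp1252", errors="replace")
    return mac, win

def differing_bytes(start, stop):
    """Byte values in [start, stop) that decode to different characters."""
    result = []
    for b in range(start, stop):
        mac, win = decode_both(b)
        if mac != win:
            result.append(b)
    return result

low_differences = differing_bytes(0, 128)     # the seven-bit range
high_differences = differing_bytes(128, 256)  # "that eighth bit"
fi_mac, fi_win = decode_both(0xDE)

print(len(low_differences), len(high_differences) > 0)
print(ascii(fi_mac), ascii(fi_win))
```

Byte 0xDE is a good example of the problem: Mac OS Roman reads it as the "fi" ligature (U+FB01), while Windows-1252 reads the same byte as the letter thorn (U+00DE).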

When a font is embedded in a PDF file, it should not matter that the file is printed from a computer running a different operating system than the one on which the PDF file was created; the embedded font is to be used for both preview and print. Font substitution at the RIP can happen, however, if a font of the same name but with different characteristics is resident at the RIP. (Adobe representatives have been heard to claim this will not happen on Adobe-based RIPs, though many in-the-trenches users report glyph substitution on output.) Subsetting fonts, in addition to making a PDF file smaller by only including the glyphs actually used in the layout, also renames fonts in the PDF file. So, if Helvetica is used in a document and the PDF file is created with font inclusion set to subset embedding, Helvetica will be included as something like EFGWXK+Helvetica. Theoretically, this means the RIP will not recognize that font and, therefore, will not use the RIP resident version of it for print. In prepress environments, fonts never should be resident on the RIP, especially for general commercial printers.

Subset fonts in PDF files are not without problems, either. Merging multiple PDF files with subset fonts can, on rare occasions, result in missing characters in the merged PDF file. Subsetting a font will rename it, as shown above, with a random generation of six alpha characters plus the base font name. Unfortunately, Acrobat will use the first version of a particular font name that it detects, creating a problem when two PDF files that happen to have the same six-character prefix are merged within Acrobat — only the glyphs used in the first merged file will be available for all of the other sections of the file. So if a new glyph is used in the subsequent PDF file merged into the first, it will not display or print, and a blank space will appear instead.
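The naming scheme and the merge problem can be modeled in a few lines. This sketch assumes the six letters are drawn uniformly at random, and the glyph inventories are hypothetical; it illustrates the failure described above rather than Acrobat's actual merge logic.

```python
import random
import string

def subset_name(base_name, rng):
    """Build a subset font name: six random capital letters, '+', base name."""
    tag = "".join(rng.choice(string.ascii_uppercase) for _ in range(6))
    return f"{tag}+{base_name}"

def missing_after_merge(first_fonts, second_fonts):
    """Model the merge problem: when both files contain a subset with the
    same full name, only the first file's glyphs survive.
    Returns {font name: glyphs the second file needs but would lose}."""
    lost = {}
    for name, glyphs in second_fonts.items():
        if name in first_fonts:
            gap = glyphs - first_fonts[name]
            if gap:
                lost[name] = gap
    return lost

rng = random.Random(2006)  # fixed seed only so the sketch is repeatable
name = subset_name("Helvetica", rng)

# Hypothetical glyph inventories for two files that share a subset name
first = {"EFGWXK+Helvetica": {"A", "B", "C"}}
second = {"EFGWXK+Helvetica": {"A", "D"}}
print(name, missing_after_merge(first, second))
```

If the prefixes really are uniformly random, there are 26^6 (about 309 million) possible tags, which fits the observation that the problem turns up only on rare occasions.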

When it comes to editing text in a PDF file, Acrobat's built-in touch-up text tool is rudimentary, at best. The font must be loaded and available on the computer where the editing is taking place, which makes it virtually impossible to edit a PDF file created on the Mac OS from a Windows PC. Conversely, Mac OS X offers the ability to load and use Windows TrueType as well as OpenType fonts. This is not to say it's easier to work with fonts on the Mac platform. In Windows 2000 and XP, fonts are found in only one location. On the Mac, as any user knows, fonts can be found just about anywhere and it's very easy to have duplicate fonts. To make matters worse, OS X includes Apple's new dfonts, which are simply the Mac version of the most commonly used fonts, like Helvetica and Times. When it comes to PDF creation, this variability can result in the wrong version of a font being embedded in a PDF file with the possible outcome of different spacing, kerning or even inclusion of a glyph that is entirely different from the one originally set in the page layout.

Put down those pixels
When creating a high-resolution graphic for print output, the rule of thumb is to allow two image pixels per halftone dot on the press. So, for a project that is to print on an offset press at 150 lines per inch (lpi), a pixel-based image optimally should contain 300 pixels per linear inch (ppi) (a.k.a. dots per inch or dpi) of effective resolution. More resolution than that is unnecessary and only will slow the RIP process. For that reason, most PDF creation tools include a down sampling option. Unfortunately, in an attempt to make the file smaller, a user might toss out too much pixel data.

Sometimes this down sampling happens unintentionally, as when the “standard” Distiller job option setting supplied with the base Distiller application is used. This standard setting, by default, sets image down sampling to 150 dpi, roughly half of the optimal amount for the average offset print job. Tools that can be used to repurpose PDF files, such as the optimizer built into Acrobat 6 or 7, also can be set to down sample image data, which is fine for making soft proof versions of PDF files but not so good if those files with lower resolution images then are used for high-resolution printing. Photoshop and other applications (like Genuine Fractals) that offer sampling algorithms notwithstanding, once you strip pixels out of an image and make it a lower resolution file, there is no way to make it look as good as it did originally.
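The two-pixels-per-halftone-dot rule of thumb reduces to simple arithmetic. The function names below are hypothetical, and the image dimensions are made-up examples.

```python
def required_ppi(screen_lpi, pixels_per_dot=2.0):
    """Rule of thumb: two image pixels per halftone dot."""
    return screen_lpi * pixels_per_dot

def effective_ppi(pixel_width, placed_width_in):
    """Resolution of an image at the size it is placed on the page."""
    return pixel_width / placed_width_in

def is_sufficient(pixel_width, placed_width_in, screen_lpi):
    """True if the placed image meets the rule of thumb for this screen."""
    return effective_ppi(pixel_width, placed_width_in) >= required_ppi(screen_lpi)

# An 1,800-pixel-wide image placed 6 inches wide, printing at 150 lpi
print(required_ppi(150))              # 300.0
print(effective_ppi(1800, 6.0))       # 300.0
print(is_sufficient(1800, 6.0, 150))  # True

# The same image after downsampling to 150 ppi (900 pixels wide)
print(is_sufficient(900, 6.0, 150))   # False
```

Note that effective resolution depends on placed size: scaling an image up in the layout lowers it just as surely as downsampling does.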

DCS + .qxd = Arghh!
Some PDF creation methods can convert a perfectly well made vector-based Encapsulated PostScript (EPS) graphic into a low-resolution bitmap image in the resultant PDF file. The PDFWriter printer driver, a tool no longer bundled with Adobe Acrobat products but still in circulation, writes PDF files in which EPS graphics are converted into low-res bitmaps. The best solution is to not use PDFWriter at all, especially if any EPS graphics are included in the original layout.

The Quartz-based “save as PDF” option that is available from every print dialog window in Mac OS X does the same thing to placed EPS graphics. In addition to converting vector-based EPS graphics to low-resolution bitmaps, the “save as PDF” option also converts the color space of those graphics to RGB. It is possible to tweak a Quartz filter through the ColorSync Utility to control some aspects of PDF creation via the “save as PDF” method, including converting RGB color back into CMYK. Duotones (Device N) and spot colors, however, will be lost and included only as RGB (or converted CMYK) colors, not as the originally specified spot colors. While the “save as PDF” method is easy, once again, it is not a good way to create PDF files for print production.

A PDF file also can lose high-resolution image data when a Desktop Color Separation (DCS) file is placed into certain layout applications. DCS files are preseparated and include a channel for each color, process and/or spot, along with a low-resolution placeholder image. The high-res data from DCS image files placed in QuarkXPress (any version) will not be included in a resultant PDF file; rather, the low-resolution display image will be included. The same is true of Illustrator CS(2). From Word, we've seen just the additional channel's data included in a PDF file, and not the process color channels at all. PDF files exported from InDesign CS(2), on the other hand, will include the high-resolution data from placed DCS images. So if you must use the DCS format, build the page in InDesign.

EPS can prevent Word RGB woes
Here's another common PDF woe: Bitmap and vector objects aren't in the proper color space for output — specifically, color is in the RGB space, rather than process CMYK or spot. For color-managed workflows, this might be intentional, as ICC-tagged RGB images can be converted to CMYK in the RIP. Conversion to RGB, however, is a side effect of any number of PDF creation methods. As we already mentioned, using the “save as PDF” option from any print dialog on Mac OS X will result in all color converting to RGB, unless a special Quartz filter is used to convert color back to CMYK.

Microsoft Word is the most well-known RGB converting culprit. Color in Word is based on the RGB-based graphics model of the operating system, which is GDI under Windows. When a file is printed from Word (or any Office application), it uses the PostScript driver to generate color, which will be in the RGB color space, including text that should be black.

To avoid this RGB conversion from Word, include all graphics saved in the EPS format. The driver will not convert any color in EPS graphics to RGB. Rather, it will pass through unaltered, so process and spot color will remain as intended in a PDF file. To prevent black text from being included in a PDF file in the RGB color space, use the Adobe PDF driver that is installed along with Adobe Acrobat.

Commonly, graphics applications convert color as a part of the printing or PDF creation process. InDesign and QuarkXPress can convert RGB to CMYK on output, depending upon the selected output settings. Adobe CS2 includes what the company refers to as a “safe CMYK workflow.” This allows RGB objects to convert to CMYK while leaving placed CMYK objects alone, preventing any additional color alteration of CMYK objects. When selecting options for CMYK conversion in the print or export PDF dialog boxes, select “preserve CMYK numbers” to cause RGB images to convert to CMYK based on the profile selected while leaving CMYK data untouched.
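The idea behind "preserve CMYK numbers" can be sketched as a pass over the page objects that converts only the RGB ones. The conversion function here is the textbook RGB-to-CMYK arithmetic, used purely as a stand-in: a real workflow would convert through the selected ICC profile and produce different numbers. The object representation is hypothetical.

```python
def naive_rgb_to_cmyk(r, g, b):
    """Textbook RGB-to-CMYK arithmetic on 0..1 values. A stand-in for a
    real ICC profile conversion, which would give different numbers."""
    k = 1.0 - max(r, g, b)
    if k == 1.0:
        return (0.0, 0.0, 0.0, 1.0)
    c = (1.0 - r - k) / (1.0 - k)
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return (c, m, y, k)

def convert_preserving_cmyk(objects):
    """Convert RGB objects to CMYK; pass everything else through untouched."""
    converted = []
    for space, values in objects:
        if space == "RGB":
            converted.append(("CMYK", naive_rgb_to_cmyk(*values)))
        else:
            converted.append((space, values))
    return converted

# Hypothetical page: one RGB object and one placed CMYK object
page = [("RGB", (1.0, 0.0, 0.0)), ("CMYK", (0.0, 0.5, 1.0, 0.1))]
result = convert_preserving_cmyk(page)
print(result)
```

The point of the design is the `else` branch: the placed CMYK values come out exactly as they went in, so they are never altered by a second conversion.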

In QuarkXPress, conversion of RGB images to CMYK happens automatically when the print colors option is set to “composite CMYK.” Depending on whether color management has been enabled, this can result in terrible or not-quite-terrible conversion of RGB images to CMYK. Device N colors also can be converted to a process equivalent during the PDF creation process, especially in PDF files created from QuarkXPress 6.x or lower. Device N colors include objects that are a mix of colors from more than one color space, like a spot-to-spot blend or a grayscale TIFF image assigned a spot color to make a “fake duotone.” Prior to QuarkXPress 6, you had to have a special Xtension from Agfa, Creo or another vendor to prevent spot-colorized TIFF images from converting to process (although placed duotones saved in the EPS format pass through unchanged). QuarkXPress 6.x includes the Device N Print Color option, rendering those Xtensions unnecessary. In QuarkXPress 7, the Device N moniker was discarded in favor of the more comprehensive “process CMYK and spot” color option setup. When you export PDF files or print PostScript to Distiller, use Device N instead of “composite CMYK,” and colorized TIFF images or spot blends will come through into the PDF file as expected.

Device RGB color can be converted to process easily using the “convert colors” option in Acrobat 7 Professional. Found on the prepress production toolbar, the option will convert color based on an ICC profile-based color space, and it can be set to convert “RGB black” text to a solid black instead of a process build. There are a number of third-party plug-ins that offer more extensive color conversion options, including Enfocus PitStop Professional (www.enfocus.com), callas Software's pdfColorConvert (www.callassoftware.com), Quite Software's Quite a Box of Tricks (www.quite.com), and ARTS PDF's Crackerjack (www.artspdf.com).

Bleed and page size issues
Most graphics applications include an option to set a bleed amount in the PDF Export or Print dialog boxes. This is one area where the Export PDF method trumps the PostScript via Print Dialog method, as this is the only setting required to ensure that the PDF is created at the correct page size, including bleed. Creating a PDF by printing PostScript through the Print dialog requires the second step of selecting the correct printer driver and entering a physical page size that is large enough to accommodate the trim, bleed, and registration or slug information. This will set the media box for the resulting PDF file.
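When the page size has to be entered by hand, the arithmetic is the trim size plus the bleed (and any extra room for marks) on every side. This sketch reports the result in PostScript points (72 per inch); the function name and the sample dimensions are illustrative assumptions.

```python
POINTS_PER_INCH = 72

def media_box_pt(trim_w_in, trim_h_in, bleed_in, marks_in=0.0):
    """Page size, in points, large enough for trim plus bleed and an
    optional extra margin for registration marks or slug on each side."""
    margin = bleed_in + marks_in
    width_in = trim_w_in + 2 * margin
    height_in = trim_h_in + 2 * margin
    return (width_in * POINTS_PER_INCH, height_in * POINTS_PER_INCH)

# Letter-size trim with 1/8-inch bleed on all sides
print(media_box_pt(8.5, 11.0, 0.125))        # (630.0, 810.0)

# The same page with a further 1/4 inch for marks
print(media_box_pt(8.5, 11.0, 0.125, 0.25))  # (666.0, 846.0)
```

Entering the trim size alone (612 x 792 points for letter) is the mistake that cuts the bleed off.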

If the media box is set too small and bleed is cut off of the PDF file, it can be enlarged to accommodate bleed by changing the custom page size in Acrobat 7's “crop pages” dialog box, or with an Enfocus PitStop Pro option. If there are bleed objects on the page, expanding the media box will reveal them. If no bleed amount was set within the original application, however, there will be nothing in the PDF to work with, so the bleed objects will have to be cloned or edited to fill the expanded page size, a daunting but doable task.

Bake someone happy
If only content creators could simply follow a single recipe from an all-purpose PDF cookbook to crank out batch after batch of perfect PDF files. Unfortunately, we're dealing with too many different ingredients — but some simple rules apply to just about every process:

  • Ensure there is sufficient image resolution in the original images and don't remove too much through down sampling.
  • Use fonts that can be embedded and don't forget to embed them.
  • Don't use a PDF-creation method that converts color unnecessarily, and have a plan for transparency flattening.

If you and your clients follow these general rules, soon everyone should be cooking up great, print-ready PDF files.

Julie Shaffer is the director of the PIA/GATF Center for Imaging Excellence (www.gain.org). Contact her at jshaffer@piagatf.org.





Flattening transparent objects
The trouble with transparency is that PostScript doesn't understand it. Transparent objects have to be “flattened” prior to output to any kind of PostScript-based printing device. Where this flattening should happen, and by whom, is a key issue.

In Adobe CS2 applications, a graphic object is a source of transparency if any of the following applies:

  • It has an opacity of less than 100 percent or an opacity mask (Illustrator).
  • It has any blending mode other than Normal.
  • It has a drop shadow or feather.
  • It has an inner glow or outer glow effect (Illustrator).
  • Its fill or stroke has a style, brush, pattern or filter effect that has any of the previous properties.
  • It is a placed Photoshop file (native, PDF or TIFF) with a transparent background.
  • It is a placed Illustrator file (native or PDF) that contains one or more objects with any of the previous properties.
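
The criteria in the list above amount to a simple test that can be applied to each object on a page. The sketch below is one way to express it; the `GraphicObject` fields are hypothetical and are not part of any Adobe scripting interface, and the fill/stroke case is omitted for brevity.

```python
from dataclasses import dataclass, field

# Effects the list above names as sources of transparency
TRANSPARENT_EFFECTS = {"drop shadow", "feather", "inner glow", "outer glow"}

@dataclass
class GraphicObject:
    opacity: float = 1.0                        # 1.0 means 100 percent
    has_opacity_mask: bool = False
    blend_mode: str = "Normal"
    effects: set = field(default_factory=set)   # e.g. {"drop shadow"}
    placed_with_transparency: bool = False      # placed Photoshop/Illustrator art

def is_transparency_source(obj):
    """True if the object meets any of the listed criteria."""
    return (
        obj.opacity < 1.0
        or obj.has_opacity_mask
        or obj.blend_mode != "Normal"
        or bool(obj.effects & TRANSPARENT_EFFECTS)
        or obj.placed_with_transparency
    )

print(is_transparency_source(GraphicObject()))                         # False
print(is_transparency_source(GraphicObject(blend_mode="Multiply")))    # True
print(is_transparency_source(GraphicObject(effects={"drop shadow"})))  # True
```
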
QuarkXPress 7 also includes the ability to create drop shadows or otherwise add transparency to objects, but because all output from Quark is PostScript-based (even PDF export), transparent objects have to be flattened to a specific bitmap resolution or removed when the PDF file is created. In Adobe CS2 and Acrobat 7, the transparency flattener will rasterize or outline objects, based on the types of objects being flattened, the file's complexity and the settings that are in effect when the flattening takes place. Three default settings are provided by Adobe, “high resolution” being the best option to maintain as much vector data as possible, although it is easy to create custom flattener settings to do things like convert everything to bitmaps or convert all text to outlines.

Flat and unhappy
Transparency flattening can lead to many problems. Any text that touches a transparent object can be rasterized during the flattening process. In most RIPs, text is imaged at a higher resolution than raster objects (usually at the highest resolution of the output device). Rasterized text is especially unsightly when just a portion of a line is converted to bitmaps, while the rest of the line remains text and is imaged at a higher resolution.

Text also can be converted to outlines by the flattener, which can result in a thickening of the stroke making up the outline of each letter. If the user chooses the option to outline all text, every text object on the page will be outlined, whether or not it touches a transparent object. Even if this option is not selected, the flattener might outline the text that is involved with transparent objects anyway, making for a possible unsightly difference between the outlined characters and the rest of the normal text on the page.

Flattening also can break all objects involved in transparency into “atomic regions.” Suppose there is a spot-colored vector object with a blend mode of “multiply” sitting on top of a bitmap image with a drop shadow beneath it, on top of a background of another spot color. When the vector object and the drop shadow are flattened, all of the objects involved likely will be broken into new objects or atomic regions that might be vector- or bitmap-based, depending on what it takes to render the transparency into something a PostScript device can digest.

For example, the flattener might use overprinting commands (supported in PostScript) to create transparent effects using opaque objects, especially when flattening a transparency that interacts with a spot color. In fact, in order to print or view a PDF file containing flattened transparent objects correctly, it is necessary to use software that supports overprinting. Without overprinting support, overprint objects will knock out whatever is beneath them. If you've ever previewed a PDF file in Acrobat or Reader and, instead of a soft drop shadow behind an image, seen a white box around a harsh black box, it probably was because overprint preview wasn't enabled. The same is true at print time: if the device doesn't support overprinting, objects that have been flattened and require overprint capabilities to render properly will knock out.
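The difference between overprint and knockout can be modeled by tracking which inks remain where one opaque object covers another. This is a simplified sketch of the principle, not of any RIP's implementation; the ink names and tints are hypothetical.

```python
def composite(bottom_inks, top_inks, overprint):
    """Inks left on the page where an opaque top object covers a bottom one.
    Inks are {name: tint} dicts with tints from 0 to 1."""
    if not overprint:
        # Knockout: the top object removes everything beneath it.
        return dict(top_inks)
    # Overprint: inks the top object does not use show through from below.
    result = dict(bottom_inks)
    result.update(top_inks)
    return result

background = {"PANTONE 185": 1.0}  # hypothetical spot-color background
shadow = {"Black": 0.4}            # flattened drop shadow, black ink only

print(composite(background, shadow, overprint=True))
print(composite(background, shadow, overprint=False))
```

With overprinting honored, the shadow darkens the spot color beneath it. Without it, the spot color is knocked out and only the black tint remains, which is the harsh box described above.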

Some users report seeing white, hairline-like artifacts at the edges of some atomic regions when they view a flattened PDF file in Acrobat or Reader. Turning off the “smooth line art” and “smooth images” options in the page display area of Acrobat's general preferences will cause the preview of these lines to disappear. It is important for prepress operators to see these, however, because they can predict where similar lines can appear in the actual printed piece when it's output through certain systems. It seems that, in some cases, in-RIP trapping will create a “choke” or pull-back trap between the atomic regions, creating an actual gap between them. While turning off in-RIP trapping typically eliminates this problem, you've then removed all trapping from the file. Note that it might be necessary to disable only “image to image” trapping, to eliminate this problem while still allowing the rest of the job to trap as it should.



Save the vector data
Optimally, transparency settings should be set to retain as much vector data as possible, because rasterization converts objects into uneditable fixed-resolution bitmaps and makes the file larger. Opening a PDF file with transparency into Photoshop will convert the entire file into a single, large bitmap image. Conversely, using the flattener in InDesign or Acrobat and setting the raster/vector slider all the way to the raster side will not rasterize the file into a single large image. Instead, it will break it into a bunch of atomic regions, the size and number of which will vary depending on the page contents.

A brighter day is coming. In time, the transparency problem will fade away as the industry gradually shifts from PostScript-based RIPs in favor of new platforms, such as Adobe's PDF Print Engine, that can render PDF files natively.