American Printer's mission is to be the most reliable and authoritative source of information on integrating tomorrow's technology with today's management.
May 1, 2005 12:00 AM
Like a cartoon elephant convulsing upon seeing a mouse, many
printing companies panic at the prospect of accepting a
customer’s database file. The same companies might routinely
use layered Adobe PhotoShop documents or complex Macromedia
Freehand illustrations, but quake in their boots when confronted
with a few thousand mailing addresses in FoxPro.
Unfortunately, some fraidy cats are missing out on everything from simple addressing jobs to profitable personalized marketing campaigns.
Columns, rows, records and fields (oh my!)
Database refers to any method for organizing and storing information—from a handwritten list in a three-ring binder to the complex infrastructure of management information systems. A discrete collection of information (such as a list of employees and their addresses) is referred to as a data file. This information can be accessed using two common tools: spreadsheets and database programs.
A spreadsheet program (such as Microsoft’s Excel), displays information as a table made up of columns and rows. Each vertical column is intended to store a single type of data (such as "First Name" or "State"), while each row contains all the data accumulated for a single person or subject. These horizontal collections of data describing a single entity are known as records, while the vertical columns of homogenous entries are called fields.
Nearly every data file is based on this table concept and stored within a proprietary format developed by a software manufacturer. Computers running MS Windows, UNIX or Mac OS X identify these formats through a unique filename extension, such as .xls for Excel or .dbf for IBM’s DBASE. In addition to storing data in its own internal format, most database programs also can export data to your choice of several proprietary formats.
Exporting data as plain text is another common feature—each record is separated by a return; individual records are separated ("delimited") by a specific character. Popular text formats for data files include the "comma separated values" format (.csv) and the "tab-delimited" format (.tab or .txt). Some older mainframe computers only can export data as a text file in which each record occupies a consistent number of characters; each field’s data is followed by enough spaces to reach the proper character count.
The XML factor
It seems likely more database systems eventually will move beyond the current table approach to the eXtensible Markup Language (XML) format. Using XML tags to identify the nature of each entry rather than its placement within a particular field and record could improve efficiency of automated processes, including language and currency translation. Furthermore, because XML is an open standard, XML database users as well as content management systems might find it easier to change vendors as their data management needs expand.
Flat field databases only can access one table of data at a time. Relational databases, as shown above, can access multiple tables simultaneously.
Database dilemma: flat field or relational?
Whether you store your data files as XML or in a more traditional format, you’ll likely choose between the two most popular categories of database programs: flat field or relational. Flat field (or non-relational) databases only can access one table of data at a time. There are dozens of flat field programs, most of which are simple list managers intended to create mailing labels or generate a searchable index for your CD collection.
By contrast, a relational database can access multiple tables
simultaneously. Of course, nothing prevents you from using a
relational database management system (RDBMS) to work with only a
Despite their ability to link multiple tables, most inexpensive relational databases (such as FileMaker Pro or Microsoft Access) typically are used with only a single data file.
Multiple tables or just one big one?
Having opted for a real database program rather than a spreadsheet, another decision lurks: Should you use multiple tables for your project, or keep it simple by compiling all your data into a single table? The answer depends—not surprisingly—upon the task’s complexity.
If you’re just addressing some envelopes or performing a mail merge on some letters, one flat-data file will suffice. Simply accept the customer’s spreadsheet or tab-delimited text file, check for any missing or corrupted data, sort by ZIP code (for a bulk-rate discount) and imprint your documents. You can use any program that can accept a flat file as input—this includes everything from Microsoft Word to sophisticated software for variable-data printing (VDP). (See "Variable-data software update," January 2004.)
Multiple tables linked via an RDBMS should be used for "one-to-many" situations. Consider the classic VDP example, the car dealership campaign. A single flat-data file can tell you where each potential customer lives and even about the most recent purchases from that dealership, but what about any previous transactions? Using two tables—one for contact information and a second to track business activity—lets you achieve a higher level of personalization. You can build a chummy campaign for the customer who’s purchased her last three cars from your dealership, but deliver a more conciliatory message to the poor guy who bought a car that incurred $4,000 in repair costs.
Suppose your corporate customers are reluctant to give you their records for a VDP campaign? Or, what if they want an ongoing campaign that constantly generates new orders based on the latest additions to the database? You’ll need to "pull" data from a running database, a practice known as a query. Database manufacturers have standardized the format for these requests using the widely supported Structured Query Language (SQL). The SQL query format allows an end user to ask for information based on certain criteria in (somewhat) plain English. High-end database manufacturers have extended SQL with additional terms and functions for more powerful searches.
In most cases, SQL searches facilitate data extraction—but beware the curse of the ancient mainframe! These dusty relics of the pre-DOS days might have significant limitations that require extra attention—data might be stored as all capital letters, punctuation might be non-existent and fields might be truncated after just 10 characters. Marketing is a battle for hearts and minds; don’t underestimate the power of mainframe data to offend your audience. A letter addressed to "H HINDERLITE," for example, will likely end up in the trash.
Cleanliness is next to postal accuracy
Once you’ve obtained the data for your mass mailing or variable-data project, evaluate the records. Common "dirty data" problems include duplicate records, incorrect data, mismatched fields and nonconformance to postal requirements. Regardless of how careful you are, expect some problems. Your customer might have merged several data files together without realizing it created hundreds of duplicate records. Or perhaps the salesperson input prospects’ first and last names into the "Last Name" field.
Cleaning your data before starting the print run isn’t just a good idea, it’s a U.S. Postal Service (USPS) requirement. To qualify for the best bulk postage rates, you’ll need to submit a "USPS Proof of Accuracy" report with your printed pieces. Coding Accuracy Support System (CASS) software automatically generates these reports.
All CASS-certified data cleansing tools share some basic goals:
The best defense is a good offense
The best approach is to prevent errors from the start. Fortunately, most databases offer a variety of validation rules (good for enforcing "CA" instead of "Calif") and input filters to avoid entering the wrong type of data (useful for mandating a five- or nine-digit ZIP code). In other cases, it makes sense to preprogram a drop-down list of choices (such as a predefined set of business categories) rather than expect everyone to input exactly the same response.
For an industry that loves spectrophotometers, platesetters and automated presses, our collective fear of numbers is unwarranted. Overcome your phobias and you’ll be able to augment those long print runs with short-run, high-margin VDP jobs. Whether you’re cleaning up your own customer list or producing millions of variable-data brochures, the key to success could be mastering your database.
What’s the difference between a database and
Microsoft Excel often is incorrectly identified as a flat-field database program. While spreadsheet programs like this create data files that can be imported into a database program, key differences include the following:
Spreadsheets don’t use an index to enhance data searches, in large part because they load the values for every cell into RAM memory for fast access. This limits the amount of information that can be stored—65,000 rows, in MS Excel. Also, unsaved data might be lost during a power outage or system crash. (Databases typically write to disk after every new operation.)
The best things in life are free
One-to-one marketing companies are heavy users of relational database programs. Did you know that they often augment their customers’ information with their own databases? Because most printing is regional in nature, you should consider it your mission to locate and obtain data that helps describe your city, state or region. Do some research, and you’ll be amazed at the amount of interesting data that can be downloaded at no cost. Access student-to-teacher ratios from countries around the world (www.uis.unesco.org), discover the number of serious crimes committed in every county of your state (www.ojp.usdoj.gov), read detailed summaries of employment statistics broken down by state and county (www.bls.gov/data), and more.
Managing much more than mailings
Progressive Solutions (Santa Clara, CA) is a 21-employee, 15,000-sq.-ft. operation specializing in digital printing, mailing and fulfillment.
Originally founded as a print broker, the 14-year-old company now produces as many as 300 jobs a month on equipment that includes an HP Indigo 3050, Xerox DocuColor 2045 and two Canon ImageRunners.
Inventory management is a key challenge, according to Progressive Solutions president Mark Sarpa. "If a client orders 400 copies of a publication or brochure, they might have 200 copies sent directly to them and 95 built into assembled kits," explains Sarpa. "The remaining105 are then added to inventory for future distribution. We need to track inventory on all these levels."
For managing everything from inventory to invoicing to production job management, Progressive Solutions relies on elaborate system of about 50 FileMaker Pro database files containing thousands of individual records. "People don’t think of FileMaker as being on par with an Oracle database, but in many ways it’s just as sophisticated," says Sarpa. He says the program’s flexibility is exactly what Progressive needs. "FileMaker is better because it’s customizable," he says. "Our customers’ files tend to be quite varied. We don’t want to ask them to compromise to fit our system."
Progressive Solutions, a FileMaker user since 1991, recently installed FileMaker Pro 7. Sarpa says he particularly likes its ability to handle multiple files within a file as well as store documents electronically.
"We’re very excited about being able to handle unlimited
tables within a file," says Sarpa. "We can also store and export
documents electronically within FileMaker Pro 7 itself via new
flexible container fields."
Progressive Solutions previously stored clients’ electronic documents on a series of five servers. Files were difficult to access and finding a specific file often required a lot of searching. According to Sarpa, FileMaker Pro 7 lets users call up documents with a simple doubleclick. "If a client calls with a question, we can give a ten-second answer instead of asking to call back with the information later," says Sarpa.
Good addresses but bad prospects?
Some addresses are completely correct, yet still not worth your postage. You can suppress these records from your mailing without deleting them from your database. BCC Software’s Mail Manager 2010 has just expanded its list of suppression options, enabling users to block delivery to those who have joined the Direct Marketing Assn.’s "Do Not Mail" list as well as prisons, nursing homes, colleges, trailer parks, the recently deceased and more.
Contributing editor Hal Hinderliter serves as director of the Graphic Communication Institute at Cal Poly State University (San Luis Obispo, CA). Contact him at email@example.com.