3. Creating Reports
Perl has a few special features that let you create simple reports. The reports can have a header area where you can place a title, page number, and other information that stays the same from one page to the next. Perl will track how many lines have been used in the report and automatically generate new pages as needed.
Compared to learning about regular expressions, learning how to create reports will be a breeze. There are only a few tricky parts, which I'll be sure to point out.
This chapter starts out by using the print() function to display a CD collection and then gradually moves from displaying the data to a fully formatted report. The data file shown in Listing 11.1 is used for all of the examples in this chapter. The format is pretty simple: the CD album's title, the artist's name, and the album's price.
|
Listing 11.1-FORMAT.DAT - The Data File |
|
You'll find that Perl is very handy for small text-based data files like this. You can create them in any editor and use any field delimiter you like. In this file, I used an exclamation point to delimit the field. However, I could just as easily have used a caret, a tilde, or any other character.
Now that we have some data, let's look at Listing 11.2, which is a program that reads the data file and displays the information.
| Pseudocode |
|
Open the FORMAT.DAT file. Read all the file's lines and place them in the @lines array. Each line becomes a different element in the array. Close the file. Iterate over the @lines array. $_ is set to a different array element each time through the loop. Remove the linefeed character from the end of the string. Split the string into three fields using the exclamation point as the delimiter. Place each field into the $album, $artist, and $price variables. Print the variables. |
|
Listing 11.2-11LST02.PL - A Program to Read and Display the Data File |
|
This program displays:
Use of uninitialized value at 11lst02.pl line 8.
Album=The Lion King Artist= Price=
Album=Tumbleweed Connection Artist=Elton John Price=123.32
Album=Photographs & Memories Artist=Jim Croce Price=4.95
Album=Heads & Tales Artist=Harry Chapin Price=12.50
Why is an error being displayed on the first line of the output? If you said that the split() function found no matching fields in the input line, so the extra variables were left with the undefined value, you were correct. The first input line was the following:
The Lion King!
There are no entries for the Artist or Price fields. Therefore, the $artist and $price variables were assigned the undefined value, which resulted in Perl complaining about uninitialized values. You can avoid this problem by assigning the empty string to any variable that has the undefined value. Listing 11.3 shows a program that does this.
| Pseudocode |
|
Open the FORMAT.DAT file, read all the lines into @lines, and then close the file. Iterate over the @lines array. Remove the linefeed character. Split the string into three fields. If any of the three fields are not present in the line, provide a default value of an empty string. Print the variables. |
|
Listing 11.3-11LST03.PL - How to Avoid the Uninitialized Error When Using the Split() Function |
|
| Clarification Note |
| The following code lines assign an empty string to each of the three variables if no information was present in the record: $album = "" if !defined($album); $artist = "" if !defined($artist); $price = "" if !defined($price); The defined() function is used to see if each variable is defined. If a variable has no value, then the "" string is assigned to it. |
| Errata Note |
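| The printed version of this book showed the split call to be: ($album, $artist, $price) = (split(/::/)); which was incorrect. The fields in FORMAT.DAT are delimited by an exclamation point, so the pattern must match that character instead. |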
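The missing-field handling can be sketched as follows. This is a minimal sketch rather than the book's listing: the parse_record() helper name is hypothetical, and the sample records are reconstructed from the output shown in this section instead of being read from FORMAT.DAT.

```perl
use strict;
use warnings;

# Sample records reconstructed from the output in this section; the real
# FORMAT.DAT file may hold more lines.
my @lines = (
    "The Lion King!\n",
    "Tumbleweed Connection!Elton John!123.32\n",
    "Photographs & Memories!Jim Croce!4.95\n",
    "Heads & Tales!Harry Chapin!12.50\n",
);

# Hypothetical helper: split one record on "!" and replace any missing
# field with the empty string so that no variable is left undefined.
sub parse_record {
    my ($line) = @_;
    chomp($line);
    my ($album, $artist, $price) = split(/!/, $line);
    $album  = "" if !defined($album);
    $artist = "" if !defined($artist);
    $price  = "" if !defined($price);
    return ($album, $artist, $price);
}

foreach my $line (@lines) {
    my ($album, $artist, $price) = parse_record($line);
    print("Album=$album Artist=$artist Price=$price\n");
}
```

Keeping the defaults inside one function means every later example can rely on all three fields being defined.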
The first four lines that Listing 11.3 displays are the following:
Album=The Lion King Artist= Price=
Album=Tumbleweed Connection Artist=Elton John Price=123.32
Album=Photographs & Memories Artist=Jim Croce Price=4.95
Album=Heads & Tales Artist=Harry Chapin Price=12.50
The error has been eliminated, but it is still very hard to read the output because the columns are not aligned. The rest of this chapter is devoted to turning this jumbled output into a report.
Perl reports have both heading and detail lines. A heading is used to identify the report title, the page number, the date, and any other information that needs to appear at the top of each page. Detail lines are used to show information about each record in the report. In the data file being used for the examples in this chapter (see Listing 11.1), each CD has its own detail line.
Headings and detail lines are defined by using format statements, which are discussed in the next section.
3.1. What’s a Format Statement?
Perl uses formats as guidelines when writing report information. A format is used to tell Perl what static text is needed and where variable information should be placed. Formats are defined by using the format statement. The syntax for the format statement is
format FORMATNAME =
FIELD_LINE
VALUE_LINE
.
The FORMATNAME is usually the same name as the file handle that is used to accept the report output. The section "Example: Changing Formats," later in this chapter, talks about using the format statement where the FORMATNAME is different from the file handle. If you don't specify a FORMATNAME, Perl uses STDOUT. The FIELD_LINE part of the format statement consists of text and field holders. A field holder represents a given line width that Perl will fill with the value of a variable. The VALUE_LINE line consists of a comma-delimited list of expressions used to fill the field holders in FIELD_LINE.
Report headings, which appear at the top of each page, have the following format:
format FORMATNAME_TOP =
FIELD_LINE
VALUE_LINE
.
Yes, the only difference between a detail line and a heading is that _TOP is appended to the FORMATNAME.
| Note |
| The location of format statements is unimportant because they define only a format and are never executed. I feel that they should appear either at the beginning of a program or the end of a program, rarely in the middle. Placing format statements in the middle of your program might make them hard to find when they need to be changed. Of course, you should be consistent where you place them. |
A typical format statement might look like this:
format =
The total amount is $@###.##
$total
.
The at character @ is used to start a field holder. In this example, the field holder is seven characters long (the at sign and decimal point count, as well as the pound signs #). The next section, "Example: Using Field Lines," goes into more detail about field lines and field holders.
Format statements are used only when invoked by the write() function. The write() function takes only one parameter: a file handle to send output to. Like many things in Perl, if no parameter is specified, a default is provided. In this case, the currently selected file handle, which is STDOUT unless you change it, will be used when no file handle is specified. In order to use the preceding format, you simply assign a value to $total and then call the write() function. For example:
$total = 243.45;
write();
$total = 50.00;
write();
These lines will display:
The total amount is $ 243.45
The total amount is $  50.00
The output will be sent to STDOUT. Notice that the decimal points are automatically lined up when the lines are displayed.
3.1.1. Example: Using Field Lines
The field lines of a format statement control what is displayed and how. The simplest field line contains only static text. You can use static or unchanging text as labels for variable information, dollar signs in front of amounts, a separator character such as a comma between first and last name, or whatever else is needed. However, you’ll rarely use just static text in your format statement. Most likely, you’ll use a mix of static text and field holders.
You saw a field holder in action in the last section in which I demonstrated sending the report to STDOUT. I'll repeat the format statement here so you can look at it in more detail:
format =
The total amount is $@###.##
$total
.
The character sequence The total amount is $ is static text. It will not change no matter how many times the report is printed. The character sequence @###.##, however, is a field holder. It reserves seven spaces in the line for a number to be inserted. The third line is the value line; it tells Perl which variable to use with the field holder. Table 11.1 contains a list of the different format characters you can use in field lines.
| Format Character | Description |
|---|---|
| @ | This character represents the start of a field holder. |
| < | This character indicates that the field should be left-justified. |
| > | This character indicates that the field should be right-justified. |
| \| | This character indicates that the field should be centered. |
| # | This character indicates that the field will be numeric. If used as the first character in the line, it indicates that the entire line is a comment. |
| . | This character indicates that a decimal point should be used with numeric fields. |
| ^ | This character also represents the start of a field holder. Moreover, it tells Perl to turn on word-wrap mode. See the section "Example: Using Long Pieces of Text in Reports" later in this chapter for more information about word-wrapping. |
| ~ | This character indicates that the line should not be written if it is blank. |
| ~~ | This sequence indicates that lines should be written as needed until the value of a variable is completely written to the output file. |
| @* | This sequence indicates that a multi-line field will be used. |
Let's start using some of these formatting characters by formatting a report to display information about the FORMAT.DAT file we used earlier. The program in Listing 11.4 displays the information in nice, neat columns.
| Pseudocode |
|
Declare a format for the STDOUT file handle. Open the FORMAT.DAT file, read all the lines into @lines, and then close the file. Iterate over the @lines array. Remove the linefeed character. Split the string into three fields. If any of the three fields are not present in the line, provide a default value of an empty string. Notice that a numeric value must be given to $price instead of the empty string. Invoke the format statement by using the write() function. |
|
Listing 11.4-11LST04.PL - Using a Format with STDOUT |
|
This program displays the following:
Album=The Lion King Artist= Price=$ 0.00
Album=Tumbleweed Con Artist= Elton John Price=$123.32
Album=Photographs & Artist= Jim Croce Price=$ 4.95
Album=Heads & Tales Artist= Harry Chapin Price=$ 12.50
You can see that the columns are now neatly aligned. This was done with the format statement and the write() function. The format statement in this example used three field holders. The first field holder, @<<<<<<<<<<<<<, created a left-justified spot for a 14-character-wide field filled by the value in $album. The second field holder, @>>>>>>>>>>>>, created a right-justified spot for a 13-character-wide field (the at sign counts) filled by the value in $artist. The last field holder, @##.##, created a 6-character-wide field filled by the numeric value in $price.
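The three field holders can be tried out with a short program. This is a minimal sketch, not the book's Listing 11.4: the sample records are taken from the output shown in this chapter rather than read from FORMAT.DAT, and a price of 0 stands in for a missing price because the numeric field holder needs a number.

```perl
use strict;
use warnings;

# The value line below refers to these package variables.
our ($album, $artist, $price);

# Detail line: left-justified album, right-justified artist, numeric price.
format STDOUT =
Album=@<<<<<<<<<<<<< Artist=@>>>>>>>>>>>> Price=$@##.##
$album, $artist, $price
.

# Sample records (an assumption; the real data lives in FORMAT.DAT).
my @records = (
    [ "The Lion King",          "",             0      ],
    [ "Tumbleweed Connection",  "Elton John",   123.32 ],
    [ "Photographs & Memories", "Jim Croce",    4.95   ],
    [ "Heads & Tales",          "Harry Chapin", 12.50  ],
);

foreach my $record (@records) {
    ($album, $artist, $price) = @$record;
    write();    # fills the field holders and prints one line to STDOUT
}
```

Values longer than a field holder are cut off at the holder's width, which is why long album titles appear truncated in the report.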
You might think it's wasteful to have the field labels repeated on each line, and I would agree with that. Instead of placing field labels on the line, you can put them in the report heading. The next section discusses how to do this.
3.1.2. Example: Report Headings
Format statements for a report heading use the same format as the detail line format statement, except that _TOP is appended to the file handle. In the case of STDOUT, you must specify STDOUT_TOP. Simply using _TOP will not work.
To add a heading to the report about the CD collection, you might use the following format statement:
format STDOUT_TOP =
@|||||||||||||||||||||||||||||||||||| Pg @<
"CD Collection of David Medinets", $%
Album Artist Price
.
Adding this format statement to Listing 11.4 produces this output:
CD Collection of David Medinets Pg 1
Album Artist Price
Album=The Lion King Artist= Price=$ 0.00
Album=Tumbleweed Con Artist= Elton John Price=$123.32
Album=Photographs & Artist= Jim Croce Price=$ 4.95
Album=Heads & Tales Artist= Harry Chapin Price=$ 12.50
Whenever a new page is generated, the heading format is automatically invoked. Normally, a page is 60 lines long. However, you can change this by setting the $= special variable.
Another special variable, $%, holds the current page number. It will be initialized to zero when your program starts. Then, just before invoking the heading format, it is incremented so its value is one. You can change $% if you need to change the page number for some reason.
You might notice that the | formatting character was used to center the report title over the columns. You might also notice that placing the field labels into the heading allows the columns to be expanded in width.
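A heading and a detail format can be combined as in the following minimal sketch. It is not one of the book's listings: the sample records and the exact column spacing are assumptions, and the field labels have been moved out of the detail line and into the heading.

```perl
use strict;
use warnings;

our ($album, $artist, $price);

# Heading: Perl invokes this automatically at the top of every page.
# $% holds the current page number.
format STDOUT_TOP =
@|||||||||||||||||||||||||||||||||||| Pg @<
"CD Collection of David Medinets", $%
Album                 Artist   Price
.

# Detail line: the labels now live in the heading.
format STDOUT =
@<<<<<<<<<<<<< @>>>>>>>>>>>> $@##.##
$album, $artist, $price
.

# Sample records (an assumption; the real data lives in FORMAT.DAT).
my @records = (
    [ "The Lion King", "",             0     ],
    [ "Heads & Tales", "Harry Chapin", 12.50 ],
);

foreach my $record (@records) {
    ($album, $artist, $price) = @$record;
    write();    # the first call also prints the heading
}
```

No code calls the heading format directly; write() notices that a new page is starting and runs STDOUT_TOP first.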
Unfortunately, Perl does not truly have any facility for adding footer detail lines. However, you can try a bit of "magic" in order to fool Perl into creating footers with static text. The $^L variable holds the string that Perl writes before every report page except for the first, and the $= variable holds the number of lines per page. By changing $^L to hold your footer and by reducing the value in $= by the number of lines your footer will need, you can create primitive footers. Listing 11.5 displays the CD collection report on two pages by using this technique.
| Pseudocode |
|
Declare a format for the STDOUT file handle. Declare a heading format for the STDOUT file handle. Open the FORMAT.DAT file, read all the lines into @lines, and then close the file. Assign a value of 6 to $=. Normally, it has a value of 60. Changing the value to 6 will create very short pages - ideal for small example programs. Assign a string to $^L, which usually is equal to the form-feed character. The form-feed character causes printers to eject a page. Iterate over the @lines array. Remove the linefeed character. Split the string into three fields. If any of the three fields are not present in the line, provide a default value of an empty string. Notice that a numeric value must be given to $price instead of the empty string. Invoke the format statement using the write() function. Print the footer on the last page. You need to explicitly do this because the last page of the report will probably not be a full page. |
|
Listing 11.5-11LST05.PL - Tricking Perl into Creating Primitive Footers |
|
This program displays the following:
CD Collection of David Medinets Pg 1
Album Artist Price
Album=The Lion King Artist= Price=$ 0.00
Album=Tumbleweed Con Artist= Elton John Price=$123.32
------------------------------------------------------------
Copyright, 1996, Eclectic Consulting
CD Collection of David Medinets Pg 2
Album Artist Price
Album=Photographs & Artist= Jim Croce Price=$ 4.95
Album=Heads & Tales Artist= Harry Chapin Price=$ 12.50
------------------------------------------------------------
Copyright, 1996, Eclectic Consulting
Let me explain the assignment to $^L in more detail. The assignment is duplicated here for your convenience:
$^L = '-' x 60 . "\n" .
"Copyright, 1996 by Eclectic Consulting\n" .
"\n\n";
The first part of the assignment, '-' x 60, creates a line of 60 dash characters. Then a newline character is concatenated to the line of dashes. Next, the copyright line is appended. Finally, two more linefeeds are appended to separate the two pages of output. Normally, you wouldn't add the ending linefeeds because the form-feed character makes them unnecessary. Here's how the code would look when designed to be sent to a printer:
$^L = '-' x 60 . "\n" .
"Copyright, 1996 by Eclectic Consulting" .
"\014";
The "\014" string is the equivalent of a form-feed character because the ASCII value for a form-feed is 12, which is 14 in octal notation.
| Note |
| I feel that it's important to say that the coding style in this example is not really recommended for "real" programming. I concatenated each footer element separately so I could discuss what each element did. The last three elements in the footer assignment should probably be placed inside one string literal for efficiency. |
| Tip |
| This example is somewhat incomplete. If the last page of the report ends at line 20 and there are 55 lines per page, simply printing the $^L variable will not place the footer at the bottom of the page. Instead, the footer will appear after line 20. This is probably not the behavior you'd like. Try the following statement to fix this problem: print("\n" x $- . "$^L"); The $- special variable holds the number of lines left on the current page, so this statement puts enough linefeeds in front of the footer to place it at the bottom of the page. |
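The footer technique, including the padding suggested in the tip, can be sketched as one function. This is a minimal sketch under stated assumptions, not the book's Listing 11.5: the print_report() helper, the CD_TOP and CD_LINE format names, and the sample records are all hypothetical, and the formats are attached to the file handle through the $~ and $^ special variables.

```perl
use strict;
use warnings;

our ($album, $artist, $price);

format CD_TOP =
@|||||||||||||||||||||||||||||||||||| Pg @<
"CD Collection of David Medinets", $%
Album                 Artist   Price
.

format CD_LINE =
@<<<<<<<<<<<<< @>>>>>>>>>>>> $@##.##
$album, $artist, $price
.

# Hypothetical helper: write @records to $fh with $page_len lines per page
# and a static footer after every page, including the last one.
sub print_report {
    my ($fh, $page_len, @records) = @_;

    my $old_fh = select($fh);   # $~, $^, $= and $- belong to the selected handle
    $~ = 'CD_LINE';             # detail format
    $^ = 'CD_TOP';              # heading format
    $= = $page_len;             # lines per page, not counting the footer

    # Perl writes $^L before every page except the first.
    local $^L = ('-' x 60) . "\nCopyright, 1996 by Eclectic Consulting\n\n\n";

    foreach my $record (@records) {
        ($album, $artist, $price) = @$record;
        write($fh);
    }

    # $- is the number of lines left on the last page: pad, then add the footer.
    my $padding = "\n" x $-;
    print {$fh} $padding, $^L;

    select($old_fh);
}

# Four-line pages: two heading lines plus two detail lines per page.
print_report(\*STDOUT, 4,
    [ "The Lion King",          "",             0      ],
    [ "Tumbleweed Connection",  "Elton John",   123.32 ],
    [ "Photographs & Memories", "Jim Croce",    4.95   ],
    [ "Heads & Tales",          "Harry Chapin", 12.50  ],
);
```

Using local on $^L restores the normal form-feed string when the function returns, so other reports in the same program are not affected.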
3.3.1. Example: Using Functions in the Value Line
You’ve already seen the value line in action. Most of the time, its use will be very simple: create the field holder in the field line and then put the variable name in the value line. But there are some other value line capabilities you should know about. In addition to simple scalar variables, you can specify array variables and even functions on the value line. Listing 11.6 shows a program that uses a function to add ellipses to a string if it is too wide for a column.
| Pseudocode |
|
Declare a format for the STDOUT file handle. In this example, the value line calls the dotize() function. Declare a heading format for the STDOUT file handle. Declare the dotize() function. Initialize local variables called $width and $string. If the width of $string is greater than $width, return a value that consists of $string shortened to $width-3 with ... appended to the end; otherwise, return $string. Open the FORMAT.DAT file, read all the lines into @lines, and then close the file. Iterate over the @lines array. Remove the linefeed character. Split the string into three fields. If any of the three fields are not present in the line, provide a default value of an empty string. Notice that a numeric value must be given to $price instead of the empty string. Invoke the format statement by using the write() function. |
This program displays the following:
Album Artist Price
The Lion King $ 0.00
Tumbleweed Con... Elton John $123.32
Photographs & ... Jim Croce $ 4.95
Heads & Tales Harry Chapin $ 12.50
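The dotize() function described in the pseudocode might look like the following. This is a minimal sketch reconstructed from that description, not the book's listing: the argument order, the column width of 17 (inferred from the output above), and the sample record are assumptions.

```perl
use strict;
use warnings;

# Shorten $string to $width characters, ending in "..." if it was cut.
# The argument order ($width first) is an assumption.
sub dotize {
    my ($width, $string) = @_;
    if (length($string) > $width) {
        return substr($string, 0, $width - 3) . "...";
    }
    return $string;
}

our ($album, $artist, $price);

# The value line may contain any expression, including a function call.
# The first field holder is 17 characters wide to match dotize(17, ...).
format STDOUT =
@<<<<<<<<<<<<<<<< @<<<<<<<<<<<< $@##.##
dotize(17, $album), $artist, $price
.

($album, $artist, $price) = ("Tumbleweed Connection", "Elton John", 123.32);
write();
```

Because the function runs each time write() is called, the detail format itself never needs to know whether a title was shortened.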
3.3.2. Example: Changing Formats
So far, you’ve seen only how to use a single format statement per report. If Perl could handle only one format per report, it wouldn’t have much utility as a reporting tool. Fortunately, by using the $~ special variable, you can control which format is used for any given write() function call. Listing 11.7 shows a program that tracks the price of the CDs in the collection and displays the total using an alternate format statement.
|