
Appendix A. Graphics Files and Resources on the Internet


Graphics files may be found in a variety of places on the Internet. They are stored as files in FTP archives, used on World Wide Web (WWW) pages as wallpaper and menus, exchanged between people as electronic mail, and distributed around the earth on the USENET global bulletin board system (BBS).

Graphics files are just chunks of data. The Internet was specifically designed to move chunks of data, easily and efficiently, from one computer to another. So you can probably guess that there is more than one way to send, retrieve, store, find, and view graphics files on the Internet.

This section explores a number of ways you can use the information services found on the Internet to collect, transport, and distribute graphics files. These include email, USENET, FTP, Archie, and the World Wide Web. We'll also briefly mention the Internet etiquette, or netiquette, that you should follow when you use these services.

Contents:
Encoding of Graphics Files
Email
USENET News
Mailing Lists
FTP Archives
Archie
The World Wide Web (WWW)
Internet Graphics Resources
For Further Information

Encoding of Graphics Files

Before discussing specific Internet services, let's look at a legacy of the Internet, the 7-bit data path, and how this affects the transmission and handling of graphics files.

It is reasonable to expect that a byte sent from one computer along a data path to another computer will retain the value it stores. For example, if I send from my computer a byte of data containing the value A and if the byte arrives at your computer still containing the value A, we can say that an error-free transfer of data has occurred. If I send you the byte value A, but when you receive this byte, it contains the value Q instead, we can say that a data transmission error has occurred.

Parity

A very popular method of detecting such data transmission errors is called single-bit parity. Parity is used to determine whether the bits in a received byte are the same value they were when the byte was sent. The lower seven bits of each byte contain the data (hence the term 7-bit data path), and the eighth bit contains the parity information. We refer to these bits as the data bits and the parity bit, respectively.

Parity may use either an even or an odd encoding scheme. If a communications link uses an even parity scheme, the parity bit is set so that the total number of 1 bits in the byte, parity bit included, is even: the parity bit is 1 if there is an odd number of 1's data bits in the byte, and 0 if the number of 1's data bits is even. If a communications link uses an odd parity scheme, it's just the opposite; a parity value of 1 indicates an even number of 1's data bits in the byte, so that the total is odd. Any byte received that contains a mismatch between the number of 1's data bit values and the parity bit value is said to have a parity error and is therefore considered corrupt.
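The parity calculation can be sketched in a few lines of Python. This is only an illustration, assuming the usual convention that even parity makes the total number of 1 bits in the byte (data bits plus parity bit) even; the function names parity_bit, add_parity, and check_parity are our own and do not come from any particular communications library.

```python
def parity_bit(data7: int, scheme: str = "even") -> int:
    """Return the parity bit (0 or 1) for a 7-bit data value."""
    if not 0 <= data7 <= 0x7F:
        raise ValueError("data must fit in seven bits")
    if scheme not in ("even", "odd"):
        raise ValueError("scheme must be 'even' or 'odd'")
    ones = bin(data7).count("1")
    # Even parity: choose the bit that makes the total count of 1 bits even.
    bit = ones % 2
    # Odd parity is simply the opposite choice.
    return bit if scheme == "even" else bit ^ 1


def add_parity(data7: int, scheme: str = "even") -> int:
    """Place the parity bit in the eighth (high) bit of the byte."""
    return (parity_bit(data7, scheme) << 7) | data7


def check_parity(byte: int, scheme: str = "even") -> bool:
    """Return True if a received byte passes the parity check."""
    ones = bin(byte & 0xFF).count("1")
    return ones % 2 == (0 if scheme == "even" else 1)


sent = add_parity(0x43)            # the letter C, three 1 data bits
print(check_parity(sent))          # intact byte passes the check
print(check_parity(sent ^ 0x01))   # one flipped bit is detected
print(check_parity(sent ^ 0x03))   # two flipped bits slip through
```

The last call shows the main weakness of single-bit parity: any even number of flipped bits leaves the count of 1 bits with the same parity, so the corruption goes unnoticed.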

It should be obvious from this description that parity is a very limited form of error checking, one that has no built-in form of error recovery, and that it is by no means a foolproof method of detecting erroneous data (two flipped bits in the same byte, for example, cancel each other out and go unnoticed). However, parity is probably the simplest and most inexpensive form of data transmission error detection yet devised. Although parity has been superseded by error-correcting protocols, many of the Internet's communications links still use parity.

Unfortunately, the use of parity error checking prevents the direct transmission of 8-bit binary data. When parity is used, only seven bits in a byte may contain data. Binary data requires eight data bits per byte for storage and transmission, precluding the use of conventional parity schemes. For this reason, data in binary form cannot be reliably sent to any point on the Internet. This presents a problem for us because most graphics files contain binary data.

How, then, do we exchange binary data across the Internet? The solution is to convert our 8-bit binary data to a 7-bit format. ASCII is the de facto standard for 7-bit data on the Internet (although habitual users of mainframes, where one of the several flavors of EBCDIC presides, may disagree). Another de facto Internet standard is used for binary-to-ASCII data conversion and is called uucoding.

Uucoding with uuencode and uudecode

Uucoding (UNIX-to-UNIX coding) is a simple algorithm used to convert three bytes of 8-bit binary data to four bytes of 7-bit ASCII data. The uuencode program converts a binary file to an ASCII equivalent in a process called uuencoding.
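The three-to-four conversion is easy to see in code. The following Python sketch splits three 8-bit bytes into four 6-bit values and adds 32 to each to obtain a printable character. It assumes the common convention of writing a zero value as a backquote rather than as a space; the names uu_char and encode_triplet are ours, chosen only for illustration.

```python
def uu_char(value: int) -> str:
    """Map a 6-bit value (0-63) to a printable uuencode character."""
    # Adding 32 lands in the printable ASCII range. A value of zero would
    # become a space, so many encoders emit a backquote (ASCII 96) instead.
    return "`" if value == 0 else chr(32 + value)


def encode_triplet(b0: int, b1: int, b2: int) -> str:
    """Convert three 8-bit bytes into four uuencoded characters."""
    values = (
        b0 >> 2,                          # top six bits of byte 0
        ((b0 & 0x03) << 4) | (b1 >> 4),   # low two of byte 0, top four of byte 1
        ((b1 & 0x0F) << 2) | (b2 >> 6),   # low four of byte 1, top two of byte 2
        b2 & 0x3F,                        # low six bits of byte 2
    )
    return "".join(uu_char(v) for v in values)


print(encode_triplet(ord("C"), ord("a"), ord("t")))  # prints 0V%T
```

Because a run of zero bytes encodes to a run of backquotes, graphics files with large empty areas tend to produce long strings of that character.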

A uuencoded file is roughly 35 to 40 percent larger than the original file. Converting every three bytes into four accounts for about 33 percent of the growth, with the rest being eaten up by control information (the length character and line terminator on each line, plus the "begin" and "end" lines). Uuencoding is also perfectly lossless; you will decode the exact file that was encoded every time.
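The size of the encoded file can be estimated with a little arithmetic. The sketch below assumes the layout used by the traditional UNIX uuencode: 45 input bytes per line, one length character at the start of each line, a single-character line terminator, a zero-length line before "end", and a "begin" line with a three-digit mode. The function name uuencoded_size and its parameters are hypothetical.

```python
def uuencoded_size(n_bytes: int, label: str = "file",
                   line_bytes: int = 45, newline_len: int = 1) -> int:
    """Estimate the size in bytes of a uuencoded file."""
    full_lines, leftover = divmod(n_bytes, line_bytes)
    # Each full line: length character + 4 characters per 3 bytes + newline.
    chars_per_full_line = 1 + 4 * (line_bytes // 3) + newline_len
    size = full_lines * chars_per_full_line
    if leftover:
        groups = (leftover + 2) // 3          # partial group is padded out
        size += 1 + 4 * groups + newline_len
    size += 1 + newline_len                   # zero-length line before "end"
    size += len(f"begin 644 {label}") + newline_len
    size += len("end") + newline_len
    return size


original = 45000
encoded = uuencoded_size(original)
print(f"{encoded / original - 1:.1%} larger")
```

Under these assumptions the three-to-four expansion contributes one third of the original size, and the per-line characters add a few percent more. A two-character line terminator, as used by MS-DOS, pushes the total slightly higher.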

The uuencode and uudecode programs originated on the UNIX operating system, but have long since been ported to almost every other operating system (certainly any operating system running on a computer that exchanges information over the Internet).

Let's look at an example. Suppose that we have a graphics file named toshi.jpg and we want to email it to someone on the Internet. We would first need to convert the binary graphics file to an ASCII uuencoded file by issuing the following command:

uuencode toshi.jpg toshi.jpg > toshi.uue

In this command line, uuencode reads the file toshi.jpg and encodes it using the file label toshi.jpg. (Note that the input filename and the file label need not be the same.) Uuencode always sends its output to standard output (normally the display), but here we've redirected it to the file toshi.uue. If we were to look at the file toshi.uue via a text editor, we might see:

begin 600 toshi.jpg
M```!``$`("`0``````#H`@``%@```"@````@````0`````$`!```````@`(`.
M``````````````````````````````"```"`````@(``@````(``@`"`@```6
[email protected]("``,#`P````/\``/\```#__P#_````_P#_`/__``#___\`````````````L
!____\
end

All of the uuencoded data is contained between the "begin" and "end" lines. The "600" is the UNIX file mode, and "toshi.jpg" is the file label used by uudecode as the name of the file in which to save the uudecoded data.
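To make the structure concrete, here is a minimal Python sketch that writes and reads this layout. It relies on the b2a_uu and a2b_uu functions from Python's standard binascii module to encode and decode individual lines; the wrapper names uuencode_bytes and uudecode_text, and the default mode of 644, are our own assumptions. It is not a replacement for the real uuencode and uudecode programs, and it does not attempt to handle variants that append extra characters to each line.

```python
import binascii


def uuencode_bytes(data: bytes, label: str, mode: int = 0o644) -> str:
    """Wrap binary data in begin/end lines, 45 input bytes per line."""
    lines = [f"begin {mode:o} {label}"]
    for i in range(0, len(data), 45):
        line = binascii.b2a_uu(data[i:i + 45], backtick=True)
        lines.append(line.decode("ascii").rstrip("\n"))
    lines.append("`")      # zero-length line marks the end of the data
    lines.append("end")
    return "\n".join(lines) + "\n"


def uudecode_text(text: str):
    """Return (label, mode, data) from single-part uuencoded text."""
    lines = iter(text.splitlines())
    # Skip everything that precedes the "begin" line.
    for line in lines:
        if line.startswith("begin "):
            _, mode, label = line.split(" ", 2)
            break
    else:
        raise ValueError("no begin line found")
    data = bytearray()
    for line in lines:
        if line == "end":
            return label, int(mode, 8), bytes(data)
        if line.strip():
            data += binascii.a2b_uu(line)
    raise ValueError("no end line found")


text = uuencode_bytes(b"Cat", "cat.txt")
print(text, end="")
print(uudecode_text(text))
```

Note that the decoder takes the output name and mode from the "begin" line, which is why editing that line is enough to change the name of the decoded file.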

To convert a uuencoded file back to its original form, we issue the uudecode command:

uudecode toshi.uue

This command reads the toshi.uue file and recreates the original file. Uudecode is also smart enough to strip away all lines that precede the "begin" line and that follow the "end" line. If you need to change the name of the decoded file, you can use a simple text editor to change the file label on the "begin" line.

In this example, the uuencoded file toshi.uue is called a single-part uuencoded file because it is stored in a single file. Large uuencoded files are frequently split into smaller parts and are stored in separate files for purposes of posting and emailing. (See the section on email that follows.) These split files are called multi-part uuencoded files.

Uudecoding a multi-part file is an easy job if you have a smart uudecoding program (such as aub, unc, uudo, uuexe, uucat, uuconvert, uulite, or uuxfer) which is able to read the headers of news articles or email messages and to decode the parts in the proper order. But, if you only have a simple uudecoding program that expects all of the data to be in a single file or file stream, then you have a bit of manual work to do.

First, make sure that you have all the parts of the uuencoded file. For example, if the file is separated into three parts, then you should have three files, each with some kind of part designation, such as a "Subject:" line or a separator line containing the strings "Part [1/3]", "Part [2/3]", and so on. You must next concatenate these files together in the proper order. With UNIX you would do the following:

cat file.01 file.02 file.03 > file.uue

With MS-DOS, you would type:

copy file.01+file.02+file.03 file.uue

Now edit file.uue and remove all headers and blank lines, returning the uuencoded data to its original contiguous state. This is how the contents of file.uue might look before editing:

[ Start of Part 1/3 ]
begin 644 judi.jpg
M```!``$`("`0``````#H`@``%@```"@````@````0`````$`!```````@`(`.
M``````````````````````````````"```"`````@(``@````(``@`"`@```6
[email protected]("``,#`P````/\``/\```#__P#_````_P#_`/__``#___\`````````````L
[ End of Part 1/3 ]
[ Start of Part 2/3 ]
M875D(')A=&[email protected]*#[email protected]("`@("U(/'-T<FEN9SX-"B`@("`@("`@("!W;W)K>
M<R!J=7-T(&QI:[email protected][email protected]@97AC97!T('[email protected]:6YS=&5A9"!O9B`@<V5T=&ENG
M9R`@=&AE#[email protected]("`@("`@("`@:&ED:6YG("!F;&%G("!F;W(@82!H96%D97(@2
[ End of Part 2/3 ]
[ Start of Part 3/3 ]
[email protected]?__^`#___``?__P`'E
M__^``___P`/__^`[email protected]`?__D`#__P``__\``/__``#[email protected]`___X`?______S
!____\
``
end
[ End of Part 3/3 ]

And after editing:

begin 644 judi.jpg
M```!``$`("`0``````#H`@``%@```"@````@````0`````$`!```````@`(`.
M``````````````````````````````"```"`````@(``@````(``@`"`@```6
[email protected]("``,#`P````/\``/\```#__P#_````_P#_`/__``#___\`````````````L
M875D(')A=&[email protected]*#[email protected]("`@("U(/'-T<FEN9SX-"B`@("`@("`@("!W;W)K>
M<R!J=7-T(&QI:[email protected][email protected]@97AC97!T('[email protected]:6YS=&5A9"!O9B`@<V5T=&ENG
M9R`@=&AE#[email protected]("`@("`@("`@:&ED:6YG("!F;&%G("!F;W(@82!H96%D97(@2
[email protected]?__^`#___``?__P`'E
M__^``___P`/__^`[email protected]`?__D`#__P``__\``/__``#[email protected]`___X`?______S
!____\
``
end

Now, all you need to do is convert the file using uudecode to retrieve the original file.
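If you would rather not edit the file by hand, the same cleanup can be automated. The Python sketch below is one possible approach, not the method used by any of the smart decoders named earlier: it keeps the "begin" and "end" lines and any line in between whose length agrees with its leading length character, and discards everything else. The names looks_uuencoded and join_parts are hypothetical, the parts must already be in the correct order, and a header line that happens to look like uuencoded data would fool this simple test.

```python
import math


def looks_uuencoded(line: str) -> bool:
    """Heuristic: does the line length match its leading length character?"""
    if not line:
        return False
    n_bytes = (ord(line[0]) - 32) & 0x3F
    expected = 4 * math.ceil(n_bytes / 3)
    body = line[:1 + expected]
    # Every data character must fall between space (32) and backquote (96).
    if not all(32 <= ord(c) <= 96 for c in body):
        return False
    # Allow one extra trailing character, which some encoders append.
    return expected <= len(line) - 1 <= expected + 1


def join_parts(parts):
    """Join the text of ordered parts into contiguous uuencoded text."""
    kept = []
    started = False
    for part in parts:
        for line in part.splitlines():
            if not started:
                if line.startswith("begin "):
                    started = True
                    kept.append(line)
                continue
            if line == "end":
                kept.append(line)
                return "\n".join(kept) + "\n"
            if looks_uuencoded(line):
                kept.append(line)
    raise ValueError("missing begin or end line; a part may be absent")


part1 = "Subject: cat.txt Part [1/2]\n\n[ Start of Part 1/2 ]\nbegin 644 cat.txt\n#0V%T\n[ End of Part 1/2 ]\n"
part2 = "Subject: cat.txt Part [2/2]\n\n[ Start of Part 2/2 ]\n`\nend\n[ End of Part 2/2 ]\n"
print(join_parts([part1, part2]), end="")
```

The result can then be saved and passed to uudecode in the usual way.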

You may experience a problem with uudecode arising from the fact that the character sequence used to terminate lines in an ASCII file differs depending upon the operating system. UNIX and the Amiga use a linefeed (ASCII 10); the Macintosh uses a carriage return (ASCII 13); and MS-DOS uses both a carriage return and a linefeed in combination.

Many uudecoders (including the original UNIX program) do not handle uuencoded files with something other than native end-of-line character(s) very well. For example, a uudecode program that expects lines of ASCII text to be terminated using only linefeeds will not be able to handle a uuencoded file whose lines are terminated only with carriage returns. The same program may complain if every linefeed in a file is also preceded by a carriage return (the notorious, but harmless, "short file" error under UNIX).
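One way around the problem is to convert the line terminators before decoding. The following Python sketch rewrites all three conventions to a single terminator of your choice. The function name normalize_newlines is our own. The conversion is safe for uuencoded data because the encoded characters never include a carriage return or a linefeed.

```python
def normalize_newlines(raw: bytes, newline: bytes = b"\n") -> bytes:
    """Convert CR LF, lone CR, and lone LF terminators to one convention."""
    # Handle the CR LF pair first so it is not counted as two line breaks.
    unified = raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", newline)


mixed = b"begin 644 cat.txt\r\n#0V%T\r`\nend\n"
print(normalize_newlines(mixed))             # linefeeds only, as on UNIX
print(normalize_newlines(mixed, b"\r\n"))    # CR LF pairs, as on MS-DOS
```

Work on the raw bytes of the file rather than on text read in "text mode", since some systems silently translate line terminators as the file is read.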

For those EBCDIC people who are wondering "What about me? I don't/can't use ASCII!", there is a program called xxencode. This program converts binary files to an EBCDIC-compatible ASCII format that resembles the output from uuencode but is not readable by uudecode. If you've been having problems with uuencoded files being munged by ASCII-to-EBCDIC converters, then using xxencode instead of uuencode on your files may solve your problems.
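The difference lies in the set of output characters. The Python sketch below assumes the commonly documented xxencode alphabet of a plus sign, a minus sign, the digits, and the upper- and lowercase letters, all of which tend to survive ASCII-to-EBCDIC translation; verify this against the xxencode program you actually use. The names XX_ALPHABET and xx_encode_triplet are hypothetical.

```python
# Assumed xxencode alphabet: 64 characters indexed by 6-bit value.
XX_ALPHABET = (
    "+-0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
)


def xx_encode_triplet(b0: int, b1: int, b2: int) -> str:
    """Convert three 8-bit bytes into four xxencode-style characters."""
    values = (
        b0 >> 2,
        ((b0 & 0x03) << 4) | (b1 >> 4),
        ((b1 & 0x0F) << 2) | (b2 >> 6),
        b2 & 0x3F,
    )
    # Look each 6-bit value up in the table instead of adding 32.
    return "".join(XX_ALPHABET[v] for v in values)


print(xx_encode_triplet(ord("C"), ord("a"), ord("t")))  # prints Eq3o
```

The bit manipulation is the same three-bytes-to-four-characters split used by uuencode; only the lookup table changes, which is why the two outputs resemble each other but cannot be decoded by the same program.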

The uuencode and uudecode programs are included with every flavor of the UNIX operating system. Implementations of these programs have been ported to almost every other operating system and are freely available in most major software archives. However, not all uuencode programs use the same encoding algorithm as the original UNIX uuencode program, or even the same command-line syntax. As uuencode has been ported to other operating systems, people have changed it to make it more efficient or more compatible with other utilities, sacrificing backward compatibility with the original program. This unfortunate occurrence has led to widespread criticism of the de facto uuencode program and has given rise to a movement to officially replace uuencode with a more standard and robust binary-to-ASCII translation program, such as btoa (binary-to-ASCII) or mmencode (also known as mimencode).



This page is taken from the Encyclopedia of Graphics File Formats and is licensed by O'Reilly under the Creative Commons Attribution license.