This comes from corion.net by Max Maischein (used with permission).
This file has some edits (and I am in the process of reformatting it and moving it into my directory structure). You can also see the original files.
The actual formats:
In the file format list, several short mnemonics are used to describe the structure of the data stored. Here I describe the structure of (and possible conversions between) some of these types. As some types have different sizes across platforms, the byte order and bit size are given for most types.

ASCIIZ
    A sequence of characters (->char), terminated with the special character with the value 0. Note that ASCIIZ strings, like most structures on Intel machines, should not be larger than 64 KB due to the ancient segmentation used.

BCD
    Binary coded decimal. A decimal number is converted into a hexadecimal number which has the same digits as the decimal number (10d becomes 10h, 21d becomes 21h).

Bitmap
    If a value is declared as bitmapped, every bit in this value might have a different meaning. The bits are numbered from right to left; the least significant bit has the number 0. After the bit number, there are either two statements separated by a slash ("/"), which are the two meanings if the bit is set / not set, or one single statement, which is the meaning of this bit if it is set.

Byte
    8 bit unsigned number. Smallest unit a record consists of. All offsets are in the unit bytes. (0-255)

Char
    Synonym for byte, most values are between 32 and 255. (#0-#255)

DWord
    32 bit signed number. Well, maybe some of the formats use a DWord which is a 32 bit unsigned number, but as files tend not to be greater than 2 GB, this won't be my concern. To convert between Intel and Motorola format, you have to swap bytes #2 & #3 and bytes #1 & #4. (-2 GB to +2 GB)

Int
    Integer. Signed 16-bit number. (-32768 to +32767)

LString
    A string which is preceded by the length. Also named "counted" string. Used by most Pascal implementations. Maximum length is 255 bytes, but it can contain any char.

Nybble
    The upper or lower four bits of a byte. A nybble is a single hex digit and can have values from 0 to 15. A signed nybble can have values from -8 to 7 with bit 3 being the sign bit.
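The conversions described for these types can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for this example and do not come from the list, and the LString reader assumes a single length byte, as the 255-byte maximum suggests.

```python
def swap_word(value):
    """Swap byte #1 with byte #2 of a 16-bit word (Intel <-> Motorola)."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)


def swap_dword(value):
    """Swap bytes #1 & #4 and #2 & #3 of a 32-bit dword (Intel <-> Motorola)."""
    raw = (value & 0xFFFFFFFF).to_bytes(4, "little")
    return int.from_bytes(raw, "big")


def bcd_to_int(byte):
    """Decode one BCD byte: the hex digits are read as decimal digits (21h -> 21)."""
    high, low = byte >> 4, byte & 0x0F
    if high > 9 or low > 9:
        raise ValueError("not a valid BCD byte")
    return high * 10 + low


def read_asciiz(data, offset=0):
    """Return the characters of an ASCIIZ string, without the terminating 0."""
    end = data.index(b"\x00", offset)
    return data[offset:end]


def read_lstring(data, offset=0):
    """Return the characters of an LString (assumed: one leading length byte)."""
    length = data[offset]
    return data[offset + 1:offset + 1 + length]
```

For example, `swap_word(0x1234)` gives `0x3412`, and `read_lstring(b"\x03abcdef")` gives `b"abc"`.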
Paragraph
    A multiple of 16 bytes. A paragraph was the granularity at which the 64K segments of the Intel chips could be addressed.

Word
    16 bit unsigned number. Note that byte order is important, whether you have a Motorola machine or an Intel one. Conversion between the two formats is simply by swapping byte #1 with byte #2. (0-65535)

How to identify different files

While searching for different file formats, I found the following programs helpful to gather information about different files. They all are DOS programs since I'm not familiar with other platforms (except Windows). Most of them should be available on SimTel CDs or via FTP at ftp.cdrom.com, except for my program TF, which is still in beta.

LIST.COM v9.0a by Vernon Buerg
    List is a file lister which supports both text and hex view.
HIEW.EXE v4.18 by Sen
    Another file lister with built-in disassembler.
FILE.EXE v2.0 by Felix von Leitner
    File is a file identification program.
Q.COM v3.01 by SemWare
    QEdit is the editor I'm editing the list with.
TF.EXE v0.38 by me
    The program that started it all. A "simple" file identification program - no more, since it has grown too big by now. Still unreleased, since it is not really extensible yet.

The file formats list meta list ;)

The file format list uses a certain format to make it readable by programs which convert it into the WinHelp format or create program structures out of the lists. This format is very similar to the format used by Ralf Brown in his PC interrupt list but was extended by me to accommodate the specific needs of this list:

Each topic in the list is delimited by a line of 45 chars, in which the first 8 contain the char '-'. After these, there follows one character which contains the type of topic. The different topics are described in the list itself; the char '!' denotes an information topic - like the list of chars and their meaning. After the topic identifier, there follows another '-' char and then the topic name, not containing any '-' chars.
After the topic name, there may be some other descriptors, such as for Motorola byte ordering, guesswork marking or other purposes; see the main list for further information. The line is ended with at least one '-' char. Take the following prototype:

--------?-TEST------------------------------
OFFSET              Count TYPE   Description
EXTENSION:
OCCURENCES:
PROGRAMS:
REFERENCE:
SEE ALSO:
VALIDATION:

Sub-topics like different records are mostly delimited by three dashes ('-'). I suggest folding them up and making them available as a popup window. Tables have the following format: (see table 0000) for a table reference and (Table 0000) for the beginning of a table. The end of a table is undefined (yet).

A primer on file formats

Abbreviations

Throughout the list, many abbreviations are used, some in the reference section. Here some are explained:

c't
    The c't is a German computer magazine, which developed the Borland Pascal for OS/2 patch. They release source code in files called CTmmyy.*. Note that comments in the source code and the language in the issues tend to be German :-)

DDJxxyy (Doctor Dobb's Journal)
    The DDJ is a monthly publication by M&T/US which is intended for the professional programmer. The four digits after the name indicate the month/year of the issue referred to. Most of the source code published in the issue is available electronically on Compu$erve and other BBSes. The files have the name DDJyymm.

PDN
    Programmer's Distribution Net. A network dedicated to the distribution of source code useful to programmers. Often linked with Fido-nodes.
Contributions to this list were made by:
        Ralf Brown (The .EXE file formats from the INTERRUPT List, general layout)
        David Dilworth ([email protected])
        Daniel Dissett ([email protected])
        Marcus Groeber ([email protected])
        Darrel Hankerson ([email protected])
        Carl Hauser ([email protected])
        Jouni Miettunen ([email protected])
        Jan Nicolai Langfeldt ([email protected])
        Mark Ouellet (Telix .FON structures)
        Greg Roelofs ([email protected])
        Robert Rothenburg Walking-Owl ([email protected])
        Jesus Villena (CONVERT.EXE, a digital sample conversion program)
        Christos Zoulas ([email protected])
        JAL / Nostalgia
        David McDuffee, (75530,[email protected])
Information gleaned from other programs:
        Formats for Word and WordPerfect (Selke's filetype)
--------!-CONTACT_INFO----------------------
If you notice any mistakes or omissions, please let me know! It is only with YOUR help that the list can continue to grow. Please send all changes to me rather than distributing a modified version of the list.

This file has been authored in the style of the INTERxxy.* file list by Ralf Brown, and uses almost the same format.

Please read the file FILEFMTS.1ST before asking me any questions. You may find that they have already been addressed.

        Max Maischein
        [email protected]
        Corion on #coders@IRC
--------!-DISCLAIMER------------------------
DISCLAIMER: THIS MATERIAL IS PROVIDED "AS IS". I verify the information contained in this list to the best of my ability, but I cannot be held responsible for any problems caused by use or misuse of the information, especially for those file formats foreign to the PC, like AMIGA or SUN file formats. If information is marked "guesswork" or undocumented, you should check it carefully to make sure your program will not break with an unexpected value (and please let me know whether or not it works the same way).

Information marked with "???" is known to be incomplete or guesswork.
Some file formats were not released by their creators, others are regarded as proprietary, which means that if your programs deal with them, you might be looking for trouble. I don't care about this.
--------!-FLAGS-----------------------------
One or more letters may follow the file format ID; they have the following meanings:
        Cx - Charset used:
                7 - Unix 7-bit characters
                A - Amiga charset (if there is one)
                E - EBCDIC character format
                U - Unicode character set
                W - Windows char set
             Default is the 8-bit IBM PC-II charset. Note that Microsoft
             introduced codepages which might be relevant with other programs.
        G  - guesswork, incomplete, unreliable etc.
        M  - Motorola byte order
             Default is Intel byte order
        O  - obsolete, valid only for version noted below
        X  - Synonym topic. See topic named under SEE ALSO.
--------!-CATEGORIES------------------------
The ninth column of the divider line preceding an entry usually contains a classification code for the application that uses those files. The codes currently in use are:
        ! - User information (not really a file format)
        A - Archives (ARC,LZH,ZIP,...)
        a - Animations (CEL, FLI, FLT,...)
        B - Binary files for compilers etc. (OBJ,TPU)
        H - Help file (HLP,NG)
        I - Images, bit maps (GIF,BMP,TIFF,...)
        D - Data support files (CPI,FON,...)
        E - Executable files (EXE,PIF)
        f - Generic file format. RIFF and IFF are generic file formats.
        F - Font files (TTF)
        G - General graphics file
        M - Module music file (MIDI,MOD,S3M,...)
        R - Resource data files (RES)
        S - Sound files (WAV,VOC,ZYX)
        T - Text files (DOC,TXT)
        W - Spreadsheet and related (WKS)
        X - Database files (DBF)
--------!-FIELDS----------------------------
After a format description, you will sometimes find other keywords. The meanings of these are:

EXTENSION:
    This is the default extension of files of the given type. On DOS systems, most files have a 3 letter extension. On Amiga systems, the files are prefixed with something.
    The DOS extensions are all uppercase, extensions for other systems are in lower case chars. On other systems which do not have the concept of extensions, such as the MAC, this is the file type.

OCCURENCES:
    Where you are likely to encounter those files. This specifies machines (like PC,AMIGA) or operating systems (like UNIX).

PROGRAMS:
    Programs which either create, use or convert files of this format. Some might be used for validation or conversion.

REFERENCE:
    A reference to a file or an article in a magazine which is mandatory or recommended for further understanding of the matter.

SEE ALSO:
    A cross reference to a topic which might be interesting as well.

VALIDATION:
    Methods to validate that the file you have is not corrupt. Normally this is a method to check the theoretical file size against the real file size. Some file formats allow no reliable validation.
--------!-FORMAT----------------------------
The block oriented files are organized in some other fashion, since the order of blocks is at best marginally obligatory. Each block type starts with the block ID (e.g. RIFFblock for a RIFF file) and, in square brackets, the character value of the ID field (e.g. [WAVE] for RIFF WAVe sound files). The block itself is described in the format description, which means you will have to look up RIFF or FORM. In the record description, the header information is omitted!

If a record is described, the record ends when the next offset is given.

Bitmapped values have a description for each bit. The value left of the slash ("/") is for the bit not set (=0), the right sided value applies if the bit is set.

A note on the tables section. The tables were added as they were introduced into Ralf Brown's interrupt list - so not everything was pressed into a table. The tables (should) have unique numbers, but they sure are out of order!
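Since the list is meant to be readable by programs, the divider lines (eight dashes, the category character in the ninth column, a dash, the topic name, optional flag letters, closing dashes) can be split mechanically. The following is a minimal sketch, not the converter actually used for this list; the name `parse_divider` and the regular expression are assumptions made for this example, and the line length is deliberately not checked.

```python
import re

# Eight dashes, one category character, a dash, the topic name (no dashes),
# optional dash-separated flag fields, then at least one closing dash.
DIVIDER = re.compile(r"^-{8}(.)-([^-]+)((?:-[^-]+)*)-+$")


def parse_divider(line):
    """Return (category, topic, flags) for a divider line, or None otherwise."""
    match = DIVIDER.match(line.rstrip())
    if match is None:
        return None
    category, topic, flag_part = match.groups()
    # flag_part looks like "-G" or "-M-G"; drop the empty pieces of the split.
    flags = [flag for flag in flag_part.split("-") if flag]
    return category, topic, flags
```

Applied to `--------B-BGI-G-----------------------------` this yields the category `B`, the topic `BGI` and the flag list `['G']`; a line that is not a divider yields `None`.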
--------!-MACHINES--------------------------
Machines that use Intel byte ordering
        PC
Machines that use Motorola byte ordering
        AMIGA, ATARI ST, MAC, SUN
--------B-BGI-G-----------------------------
The BGI files are graphic drivers used by the Borland compilers to provide graphics output for different graphics cards. They are loaded dynamically. The exact format is not known to me ...
OFFSET              Count TYPE   Description
0000h                   4 char   ID='FBGD'
0004h                   1 dword  ID=08080808h
                                 used to backspace over ID if typing the file
0008h                   ? char   Driver ID string, terminated with #26
EXTENSION:BGI
OCCURENCES:PC
PROGRAMS:Borland Pascal, Borland C, Turbo Pascal
--------M-CMF-G-----------------------------
The CMF files are music files used by the SoundBlaster sound card family. The Creative Labs Music Format might be proprietary, the info is guesswork.
OFFSET              Count TYPE   Description
0000h                   4 char   ID="CTMF"
*********
EXTENSION:CMF
OCCURENCES:PC
PROGRAMS:PLAYCMF.EXE
--------E-CORE-G----------------------------
The core images are dumps of the system core from different unix machines (as far as I gather). Info comes from a magic file - so this is only good for identification. What you would do with a core image on a foreign machine, eludes me anyway. Maybe the information below is wrong and the 386 core dump also belongs to the word at 0174h...
OFFSET              Count TYPE   Description
0000h                   4 char   ID='core'
0174h                   1 word   Executable type 1
                                   015Dh - B370 executable
                                   5D01h - B370 executable
                                   0158h - B370 executable
                                   5801h - B370 executable
                                   015Fh - XA370 executable
                                   05F01h - XA370 executable
                                   015Ah - XA370 executable
0176h                   1 word   Executable type 2
                                   0176h - 386 executable
EXTENSION:???
OCCURENCES:Unix flavours
PROGRAMS:N/A
SEE ALSO:
--------D-CPI-G-----------------------------
The DOS CPI files are data files which are loaded by the country drivers of MS-DOS. The information comes from a magic file, which makes it good for identification only.
OFFSET              Count TYPE   Description
0000h                   9 char   ID=255,'FONT   ',0
EXTENSION:CPI
OCCURENCES:PC
PROGRAMS:MS-DOS
--------X-CRD-G-----------------------------
The Windows 3.1 Cardfile.EXE is a (simple) address book application included with the Windows 3.1+ operating system by Microsoft.
OFFSET              Count TYPE   Description
EXTENSION:CRD
OCCURENCES:PC, ALPHA?
PROGRAMS:CARDFILE.EXE
--------?-DMS-------------------------------
The DMS (Digital Music System??) are some other files I found on a mixed system CD, so I include them in my listing. They are Amiga files, so here's the call to the Amiga folks again.
OFFSET              Count TYPE   Description
0000h                   4 char   ID="DMS!"
EXTENSION:DMS
OCCURENCES:Amiga
--------A-DWC-?-----------------------------
The DWC archives seem to be a relic from ancient computing times - I've never seen any program that dealt with them or could create them. They are still included in this compilation for reasons I don't know. But if one of you stumbles over such a file, this documentation might be helpful.

The DWC archives consist of single file entries with one archive trailer. The archive entries seem to be at the start of the archive, but maybe they are stored at the end of the archive, before the trailer. Each file header has the following format:
OFFSET              Count TYPE   Description
0000h                  13 char   Name of the original file in ASCIIZ.
000Dh                   1 dword  Size of the original file
0011h                   1 dword  MS-DOS date and time of the original file
0015h                   1 dword  Size of the compressed file
0019h                   1 dword  Offset of compressed data in archive file
001Dh                   3 byte   reserved
0020h                   1 byte   Method :
                                   1 - crunched
                                   2 - stored

The trailer at the end of each archive has the following format:
OFFSET              Count TYPE   Description
0000h                   1 word   Length of trailer (=27)
0002h                   1 word   Size of the directory entries (=34)??
0004h                  16 byte   reserved
0014h                   1 dword  Count of the directory entries
0018h                   3 char   ID="DWC"
EXTENSION:DWC??
OCCURENCES:PC??
PROGRAMS:DWC.EXE??
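The two DWC tables above translate directly into fixed-size records. The sketch below decodes them with Python's `struct` module, assuming Intel byte order (the default for this list) and assuming the trailer sits in the last 27 bytes of the file. The function names are hypothetical, and since the format itself is marked as uncertain, this has not been checked against real archives.

```python
import struct

# Field layouts copied from the two tables; "<" selects Intel byte order.
DWC_ENTRY = struct.Struct("<13sIIII3sB")    # 33 bytes, offsets 0000h-0020h
DWC_TRAILER = struct.Struct("<HH16sI3s")    # 27 bytes, ends with ID="DWC"

DWC_METHODS = {1: "crunched", 2: "stored"}


def parse_dwc_trailer(data):
    """Decode the trailer, assumed to be the last 27 bytes of the archive."""
    if len(data) < DWC_TRAILER.size:
        raise ValueError("too short for a DWC trailer")
    length, entry_size, _reserved, count, ident = DWC_TRAILER.unpack(
        data[-DWC_TRAILER.size:])
    if ident != b"DWC":
        raise ValueError("DWC signature not found")
    return {"trailer_length": length, "entry_size": entry_size, "entries": count}


def parse_dwc_entry(raw):
    """Decode one file header as laid out in the first table."""
    name, size, stamp, packed, offset, _reserved, method = DWC_ENTRY.unpack(
        raw[:DWC_ENTRY.size])
    return {
        "name": name.split(b"\x00", 1)[0].decode("ascii", "replace"),
        "original_size": size,
        "dos_datetime": stamp,
        "compressed_size": packed,
        "data_offset": offset,
        "method": DWC_METHODS.get(method, "unknown"),
    }
```

Note that the entry layout in the table adds up to 33 bytes while the trailer field is documented as "(=34)??"; a reader that wants to be safe would step through the directory using the size stored in the trailer rather than `DWC_ENTRY.size`.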
--------S-EFE-------------------------------
The EFE files are instrument files for the Ensoniq sampler system. Further information wanted.
EXTENSION:EFE
SEE ALSO:GKH,INS
--------E-EXE-X-----------------------------
Different types of executables have emerged on the Intel DOS related platforms - but all contain at least a stub MZ Exe before their actual EXE body...
SEE ALSO:MZ EXE,NE EXE
--------D-FON-?-----------------------------
The Telix .FON files are the telephone books Telix uses to store numbers in. The format is for Telix 3.22
OFFSET              Count TYPE   Description
0000h                   1 dword  ID=2E2B291Ah
0004h                   1 word   Version info (=1)
0006h                   1 word   Number of entries in directory (count from 1)
0007h                   1 char   ?will be used for encryption? Currently 0
0008h                  55 byte   reserved
0040h                   ? rec    Actual phonebook entry
                       25 char   Name (0 terminated)
                       17 char   Phone number (0 terminated)
                        1 byte   Baud rate (see table 0006)
                        1 byte   Parity type (see table 0007)
                        1 byte   Data bits (7 or 8)
                        1 byte   Stop bits (1 or 2)
                       12 char   Script file name
                        6 char   Date of last call in ASCII
                        1 word   Number of total calls
                        1 byte   Terminal type (see table 0008)
                        1 byte   Protocol
                        1 byte   Flags, bitmapped
                                   0 - Local echo on / off
                                   1 - add linefeeds on / off
                                   2 - backspace is destructive on / off
                                   3 - backspace sends DEL / sends BS
                                   4 - strip high bits on / off
                                   5-7 - reserved
                        1 word   unknown
                        1 byte   Dial prefix index
                       14 char   Password

(Table 0006)
Baud rate tables for Telix
  0 = 300 baud
  1 = 1200 baud
  2 = 2400 baud
  3 = 4800 baud
  4 = 9600 baud
  5 = 19200 baud
  6 = 38400 baud
  7 = 57600 baud
  8 = 115200 baud

(Table 0007)
Parity types for Telix
  0 = None
  1 = Even
  2 = Odd
  3 = Mark
  4 = Space

(Table 0008)
Terminal types for Telix
  0 = TTY
  1 = ANSI-BBS
  2 = VT102
  3 = VT52
  4 = AVATAR
  5 = ANSI
EXTENSION:FON
OCCURENCES:PC
PROGRAMS:Telix v3.22
REFERENCE:
SEE ALSO:
VALIDATION:
--------M-FPT-------------------------------
The Fandarole Pattern files are used by the Fandarole Composer to store single patterns in a file.
OFFSET              Count TYPE   Description
0000h                   4 char   ID='FPT',254
0004h                  32 char   ASCII pattern name
0024h                   3 char   ID=10,13,26
0027h                   1 word   Remaining size of file (size of pattern)
0029h                   1 byte   Break location (length of pattern)
002Ah                   1 byte   reserved
002Bh                   ? byte   Pattern in raw format like in the .FAR file
EXTENSION:FAR,FPT
OCCURENCES:PC
PROGRAMS:Fandarole Composer
SEE ALSO:FAR,FSM
VALIDATION:
--------S-FSM-------------------------------
The .FSM files are samples to be used for module style music with the Fandarole Composer. Currently only samples of up to 64K length are supported, although the header reserves a dword for the sample size.
OFFSET              Count TYPE   Description
0000h                   4 char   ID='FSM',254
0004h                  32 char   ASCII name of sample
0024h                   3 char   ID=10,13,26
0027h                   1 dword  Length of sample (<=64K)
0028h                   1 byte   Fine tune value for sample (currently unsupported)
0029h                   1 byte   Sample volume (currently unsupported)
002Ah                   1 dword  Start of sample loop
002Dh                   1 dword  End of sample loop. If the sample is not set to
                                 loop (see below) this should be set to the end
                                 of the sample.
0032h                   1 byte   Sample type, bitmapped
                                   0 - 8-bit/16-bit sample
                                   1-7 - reserved
0033h                   1 byte   Loop mode, ?bit mapped?
                                   0-2 - reserved
                                   3 - loop off/loop on
                                   4-7 - reserved
0034h                   ? byte   Sample data in signed format
EXTENSION:FSM
OCCURENCES:PC
PROGRAMS:Fandarole Composer
REFERENCE:
SEE ALSO:FAR,USM
VALIDATION:
--------S-GKH-------------------------------
The GKH files are disk images of the Ensoniq EPS sampler system. Further information is missing.
EXTENSION:GKH
SEE ALSO:EFE,INS
--------a-GRASPRT GL-G----------------------
The .GL animation files are graphic animations, some just .GIF files, others mini-movies, used mostly for x-rated adult animations. The format of the files is plain guesswork by me. The analyzed file did not include any animations but only .GIF files and two text files which seemed to be the animation script.
There is no safe way of identifying a file as a GL animation, except maybe for adding up the subfile sizes and the header size and then checking whether this matches the file size.
OFFSET              Count TYPE   Description
0000h                   1 word   Length of header, excluding this word
                                 ="HLN"
0002h                   ? rec    The directory entries for each file
                        1 dword  Offset of the stored file
                       12 char   DOS file name of the stored file
0002h+                  1 dword  Length of the first stored file
"HLN"                   ? byte   The first file

The other files follow in similar manner, length->file->length->file
EXTENSION:GL
OCCURENCES:PC
PROGRAMS:GRASPRT
--------?-GRIB------------------------------
The GRIB weather product information files just might be some satellite images or something else. I have only seen this signature in a magic file and further information about the format is not known to me.
OFFSET              Count TYPE   Description
0000h                   4 char   ID='GRIB'
EXTENSION:???
OCCURENCES:???
PROGRAMS:???
--------A-HA--------------------------------
HA files (not to be confused with HamarSoft's HAP files [3]) contain a small archive header with a word count of the number of files in the archive. The constituent files are stored sequentially with a header followed by the compressed data, as is the case with most archives.

The main file header is formatted as follows:
OFFSET              Count TYPE   Description
0000h                   2 char   ID='HA'
0002h                   1 word   Number of files in archive

Every compressed file has a header before it, like this:
OFFSET              Count TYPE   Description
0000h                   1 byte   Version & compression type
0001h                   1 dword  Compressed file size
0005h                   1 dword  Original file size
0009h                   1 dword  CCITT CRC-32 (same as ZModem/PkZIP)
000Dh                   1 dword  File time-stamp (Unix format)
?                       ? char   ASCIIZ pathname
?                       ? char   ASCIIZ filename
????h                   1 byte   Length of machine specific information
                        ? byte   Machine specific information

Note that the path separator for pathnames is the 0FFh (255) character.
The high nybble of the version & compression type field contains the version information (0=HA 0.98), the low nybble is the compression type:

(Table 0012)
HA compression types
  0     "CPY"     File is stored (no compression)
  1     "ASC"     Default compression method, using a sliding window
                  dictionary with an arithmetic coder.
  2     "HSC"     Compression using a "finite context [sic] model and
                  arithmetic coder"
 14     "DIR"     Directory entry
 15     "SPECIAL" Used with HA 0.99B (?)

Machine specific information known:
                        1 byte   Machine type (Host-OS)
                                   1 = MS DOS
                                   2 = Linux (Unix)
                        ? bytes  Information (currently only file-attribute info)
EXTENSION:HA
OCCURENCES:PC, Linux
PROGRAMS:HA
REFERENCE:
--------I-HSI1------------------------------
The HSI1 images are a JPEG derivative made by Handmade Software for their Image Alchemy package.
OFFSET              Count TYPE   Description
0000h                   4 char   ID='HSI1'
EXTENSION:JPG
OCCURENCES:PC,SUN
PROGRAMS:Image Alchemy
REFERENCE:
SEE ALSO:JPEG
VALIDATION:
--------A-HYP-------------------------------
The Hyper archiver is a very fast compression program by P. Sawatzki and K.P. Nischke, which uses LZW compression techniques for compression. It is not very widespread - in fact, I've yet to see a package distributed in this format.
OFFSET              Count TYPE   Description
0000h                   1 byte   ID=1Ah
0001h                   2 char   Compression method
                                   "HP" - compressed
                                   "ST" - stored
0003h                   1 byte   Version file was compressed by in BCD
0004h                   1 dword  Compressed file size
0008h                   1 dword  Original file size
000Ch                   1 dword  MS-DOS date and time of file (see table 0009)
0010h                   1 dword  CRC-32 of file
0014h                   1 byte   MS-DOS file attribute
0015h                   1 byte   Length of filename
                                 ="LEN"
0016h               "LEN" char   Filename
EXTENSION:HYP
OCCURENCES:PC
PROGRAMS:HYPER.EXE
--------S-INS-------------------------------
The INS files are instrument files for the Ensoniq sampler system. Further information wanted.
EXTENSION:INS
SEE ALSO:EFE,GKH
--------I-LBM-M-----------------------------
The LBM/ILBM format is used by Deluxe Paint to store bitmap images.
It uses the IFF file format and Motorola byte order.

FORMblock [BMHD]
This block contains the information about the image.
OFFSET              Count TYPE   Description
0000h                   1 word   The image width (x-axis)
0002h                   1 word   The image height (y-axis)
0004h                   1 dword  reserved
0008h                   1 byte   Bits per pixel
0009h                   1 byte   ??reserved??

FORMblock [BODY]
This block contains the (compressed) image data...
****

FORMblock [CRGN]
This block contains palette information for a range of palette entries.
OFFSET              Count TYPE   Description

FORMblock [TINY]
This block contains a small image used for previewing.
OFFSET              Count TYPE   Description
EXTENSION:IFF,LBM
OCCURENCES:AMIGA,PC
PROGRAMS:Deluxe Paint
REFERENCE:???
SEE ALSO:IFF
--------A-LBR-------------------------------
The LBR files consist of a directory and one or more "members". The directory contains from 4 to 256 entries and each entry describes one member. The first directory entry describes the directory itself. All space allocations are in terms of sectors, where a sector is 128 bytes long. Four directory entries fit in one sector, thus the number of directory entries is always evenly divisible by 4.
Different types of LBR files exist, all versions are discussed here, the directory entry looks like this :
OFFSET              Count TYPE   Description
0000h                   1 byte   File status :
                                     0 - active
                                   254 - deleted
                                   255 - free
0001h                  11 char   File name in FCB format (8/3, blank padded),
                                 directory name is blanks for old LU,
                                 ID='********DIR' for LUPC
000Ch                   1 word   Offset to file data in sectors
000Eh                   1 word   Length of stored data in sectors

For the LUPC program, the remaining 16 bytes are used like this :
OFFSET              Count TYPE   Description
0000h                   8 char   ASCII date of creation (MM/DD/YY)
0008h                   8 char   ASCII time of creation (HH:MM:SS)

For the LU86 program, the remaining 16 bytes are used like this :
OFFSET              Count TYPE   Description
0000h                   1 word   CRC-16 or 0
0002h                   1 word   Creation date in CP/M format
0004h                   1 word   Creation time in DOS format
0006h                   1 word   Date of last modification, CP/M format
0008h                   1 word   Time of last modification, DOS format
000Ah                   1 byte   Number of bytes in last sector
000Bh                   5 byte   reserved (0)
EXTENSION:LBR
OCCURENCES:PC,CP/M
PROGRAMS:LU.COM, LUU.COM, LU86.COM
SEE ALSO:
--------A-MS COMPRESS 5.0-G-----------------
Microsoft ships its files compressed with COMPRESS.EXE, for expansion the program EXPAND.EXE (how original ;) ) is used. The program EXPAND.EXE is available with every copy of MS-DOS 5.0+, the program COMPRESS.EXE is available with several development kits, I found it with Borland Pascal 7.0. The compression seems to be some kind of LZ-Compression, as the fully compatible? LZCopy command under Windows can decompress the same files. This compression feature seems to be available on all DOS-PCs.
OFFSET              Count TYPE   Description
0000h                   4 char   ID='SZDD'
0004h                   1 long   reserved, always 3327F088h ?
0008h                   1 byte   reserved
0009h                   1 char   Last char of filename if file was compressed
                                 into "FILENAME.EX_".
000Ah                   1 long   Original file size
000Eh                   1 byte   reserved, varies...
EXTENSION:*.??_
OCCURENCES:PC
PROGRAMS:COMPRESS.EXE, EXPAND.EXE, LZEXPAND.DLL
REFERENCE:?Windows SDK?
SEE ALSO:MS COMPRESS 6.22+
VALIDATION:
--------A-MS COMPRESS 6.22+-G---------------
At least with version 6.22 of MS-DOS, Microsoft changed their compression program to a new signature. The program no longer seems to be able to restore files to their original name if it is not given on the command line.
OFFSET              Count TYPE   Description
0000h                   4 char   ID="KWAJ"
0004h                   1 long   reserved, always 0D127F088h ?
0008h                   1 long   reserved, always 00120003h ?
000Ch                   1 word   reserved, always 01 ?
EXTENSION:*.??_
OCCURENCES:PC
PROGRAMS:COMPRESS.EXE, EXPAND.EXE, LZEXPAND.DLL
REFERENCE:?Windows SDK?
SEE ALSO:MS COMPRESS 5.0
VALIDATION:
--------I-MSK-------------------------------
The MSK files are mask files used by the Autodesk Animator and Animator Pro packages. Two types of MSK files exist. The Animator Pro version is simply a PIC file with the depth 1.

A MSK file created by the original Animator is exactly 8000 bytes long. There is no file header or other control information in the file. It contains the image bit map, 1 bit per pixel, with the leftmost pixels packed into the high order bits of each byte. The size of the image is fixed at 320x200. The image is stored left-to-right, top-to-bottom.
EXTENSION:MSK
OCCURENCES:PC
PROGRAMS:Autodesk Animator
SEE ALSO:PIC,FLIc
--------M-MTS-------------------------------
The Master Tracker program by the French demo group Arkham is a tracker for AdLib, SB and speaker - the further limits of this tracker are unknown to me.
OFFSET              Count TYPE   Description
0000h                   6 char   ID="MTRAC "
0006h                  20 char   Song name, zero padded
EXTENSION:MST
OCCURENCES:PC
PROGRAMS:Master Tracker v1.0
SEE ALSO:MOD
--------H-NG-G------------------------------
Information about this format comes only from a magic file, thus is only good for file identification. I did not test it, since I don't have any NG files. The Norton Guides are a popup help program for the IBM PCs which provide instant help anywhere...
OFFSET              Count TYPE   Description
0000h                   2 char   ID='NG'
0002h                   1 dword  ID=0
EXTENSION:NG
OCCURENCES:PC
PROGRAMS:NG.EXE
SEE ALSO:TPH,HLP
--------H-OS/2 HELP-------------------------
The OS/2 help files are different from the WinHelp help files, since the WinHelp format is proprietary to Microsoft because of the patented LZ-packing they implemented.
OFFSET              Count TYPE   Description
0000h                   3 char   ID='HSP'
0003h                   1 byte   Flags :
                                   0 - INF style file
                                   1-3 - unknown
                                   4 - HLP style file
                                 Patching this file allows reading HLP files
                                 using the VIEW command, while HLP files seem
                                 to work with INF settings as well.
0005h                   1 word   Total size of header
0007h                   1 word   Unknown
????h                            other data
0047h                   ? char   ASCIIZ name of the HLP/INF file
EXTENSION:HLP,INF
OCCURENCES:OS/2
REFERENCE:INF02A.DOC
SEE ALSO:WinHelp HLP
--------X-PARADOX DATAFILES-?---------------
The data files for the Paradox database engine have the following format:
OFFSET              Count TYPE   Description
0000h                   1 byte   Number of bytes per record
0001h                  32 byte   ????
0021h                   1 byte   Number of fields per record
0022h                   1 byte   ?Password protected? / other flags ?
                                 - if password protected, 32 more bytes seem to
                                 be inserted.
0023h                  ?? byte   ?????
0058h                   ? rec
                        1 byte   Field type ?
                                   1 - character field
                                   5 - currency?
                                   6 - integer
                        1 byte   Field length

After that, my information becomes really blurry :-I

There seems to follow the name of the file, and some 0-filled areas, and after that the "first ASCII character after 0C0h" is said to be the start of the field names. Each field name is in ASCIIZ. The actual records start after the field names, either at the 4th byte after 00h 02h (the sequence ending the field names section) or after 00h 02h 00h 00h 00h.
EXTENSION:???
OCCURENCES:PC
PROGRAMS:Paradox engine
SEE ALSO:
--------I-PBM-G-----------------------------
The PBM files are image files, which were used at least by DMGraph, a utility to insert new graphics into a DOOM WAD file.
The image dimensions seem to be stored in ASCII format delimited with CR/LF; after that follows the raw binary image data.
OFFSET              Count TYPE   Description
0000h                   1 char   ID='P'
0001h                   1 char   Bitmap type :
                                   '1' - PBM bitmap
                                   '2' - PGM greymap
                                   '3' - PPM pixmap
                                   '4' - PBM raw bitmap
                                   '5' - PGM raw greymap
                                   '6' - PPM raw pixmap
EXTENSION:PBM,PGM,PPM
OCCURENCES:PC
PROGRAMS:DMGraph.EXE
--------I-PIC-------------------------------
PIC files contain images in an uncompressed format. Both the original Animator and Animator Pro from Autodesk produce PIC files. The file formats are different; Animator Pro produces a hierarchical block oriented file, while the original Animator file is a simpler fixed format. See PIC(Pro) for further information on the Animator Pro PIC format.

The original Animator uses this format to store a single-frame picture image. This format description applies to both PIC and original Animator CEL files. The file begins with a 32 byte header, as follows:
OFFSET              Count TYPE   Description
0000h                   1 word   ID=9119h
0002h                   1 word   Width of image; PIC files always have a width
                                 of 320, CEL images may have any value.
0004h                   1 word   Height of image, 200 for a PIC, any value for
                                 a CEL file.
0006h                   1 word   X offset of image, always 0 for a PIC image,
                                 may be nonzero in a CEL image.
0008h                   1 word   Y offset of image. Zero for a PIC file.
000Ah                   1 byte   Bits per pixel (8)
000Bh                   1 byte   Compression flag, always zero
000Ch                   1 dword  Size of the image data in bytes
0010h                  16 byte   reserved (0)

Immediately following the header is the color map. It contains all 256 palette entries in rgb order. Each of the r, g, and b components is a single byte in the range of 0-63. Following the color palette is the image data, one byte per pixel, from left to right, top to bottom.
EXTENSION:PIC,CEL
OCCURENCES:PC
PROGRAMS:Autodesk Animator
SEE ALSO:CEL,FLIc,PIC(PRO)
--------I-PIC(PRO)--------------------------
This format description applies to both PIC and MSK files created with the Autodesk Animator Pro package.
The file begins with a 64-byte header defined as follows:
OFFSET              Count TYPE   Description
0000h                   1 dword  The size of the whole file including the size
                                 of this header.
0004h                   1 word   ID=9500h
0006h                   1 word   Width of the image
0008h                   1 word   Height of the image
000Ah                   1 word   X offset of image
000Ch                   1 word   Y offset of image
000Eh                   1 dword  User ID, set to zero
0012h                   1 byte   Bits per pixel (8 for PIC, 1 for MSK)
0013h                  45 byte   reserved (0)

Following the file header are the data blocks for the image. Each data block within a PIC or MSK file is formatted as follows:
OFFSET              Count TYPE   Description
0000h                   1 dword  The size of the block, including this header.
0004h                   1 word   Data type ID :
                                   0 - Color palette info
                                   1 - Byte-per-pixel image data
                                   2 - Bit-per-pixel mask data
0006h                   ? byte   Data

The type values in the block headers indicate what type of graphics data the block contains.

In a PIC_CMAP block, the first 2-byte word is a version code; currently this is set to zero. Following the version word are all 256 palette entries in rgb order. Each of the r, g, and b components is a single byte in the range of 0-255. This type of block appears in PIC files; there will generally be no color map block in a MSK file.

In a PIC_BYTEPIXELS block, the image data appears immediately following the 6-byte block header. The data is stored as one byte per pixel, in left-to-right, top-to-bottom sequence.

In a PIC_BITPIXELS block, the bitmap data appears immediately following the 6-byte block header. The data is stored as bits packed into bytes such that the leftmost bits appear in the high-order positions of each byte. The bits are stored in left-to-right, top-to-bottom sequence. When the width of the bitmap is not a multiple of 8, there will be unused bits in the low order positions of the last byte on each line. The number of bytes per line is ((width+7)/8). This type of block appears in MSK files.
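The bit packing of a PIC_BITPIXELS block can be illustrated with a short Python sketch. It assumes the division in ((width+7)/8) is an integer division and that `data` holds only the bitmap bytes that follow the 6-byte block header; both function names are invented for this example.

```python
def bytes_per_line(width):
    """Bytes needed for one line of a 1-bit-per-pixel bitmap: (width+7)/8."""
    return (width + 7) // 8


def mask_pixel(data, width, x, y):
    """Return 0 or 1 for the pixel at (x, y).

    Lines are stored top to bottom; within a byte the leftmost pixel sits
    in the high-order bit (bit 7).
    """
    line = bytes_per_line(width)
    byte = data[y * line + (x >> 3)]
    return (byte >> (7 - (x & 7))) & 1
```

For a width of 10 each line takes 2 bytes, and the low six bits of the second byte are unused. For a 320x200 bitmap the same formula gives 40 bytes per line, or 8000 bytes in total.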
EXTENSION:PIC,MSK
OCCURENCES:PC
PROGRAMS:Autodesk Animator Pro
REFERENCE:
SEE ALSO:PIC,FLT
--------I-PLY-------------------------------
The PoLYgon files created by the Autodesk Animator packages contain a
set of points that describe a polygon.

OFFSET              Count TYPE   Description
0000h                   1 word   Number of points in the file
0002h                   1 dword  reserved (0)
0006h                   1 byte   Closed shape flag. If nonzero, there is
                                 an implied connection between the last
                                 and the first point. If it is zero, the
                                 shape is open.
0007h                   1 byte   ID=99h

After the header follows the point data, organized in records like
this :

OFFSET              Count TYPE   Description
0000h                   1 word   X coordinate
0002h                   1 word   Y coordinate
0004h                   1 word   Z coordinate, always zero

EXTENSION:PLY
OCCURENCES:PC
PROGRAMS:Autodesk Animator
--------M-PTM-------------------------------
Poly Tracker is a Scream Tracker 3 like tracker written by Lone Ranger
of AcmE. This is a description of version 2.03 of the PTM format.
Earlier formats are no longer used or supported by the current version
of Poly Tracker (it still says "version 1.0β", but there have been
about a dozen different versions, including some customized test
versions). The samples are stored using delta-compression.

OFFSET              Count TYPE   Description
0000h                  28 char   Songname in ASCIIZ format, 0 padded
001Ch                   1 char   ID=#26
001Dh                   1 word   File type version, currently 0203h
001Fh                   1 byte   reserved (0)
0020h                   1 word   Number of orders
                                 ="ORD"
0022h                   1 word   Number of instruments
                                 ="INS"
0024h                   1 word   Number of patterns
                                 ="PAT"
0026h                   1 word   Number of voices used
                                 ="CHN"
0028h                   1 word   File flags (always 0 ??)
002Ah                   1 word   reserved (0)
002Ch                   4 char   ID='PTMF'
0030h                  16 byte   reserved (0)
0040h                  32 byte   Pan settings for each channel :
                                 0 = left, 7 = middle, 15 = right
0060h                 256 byte   Order list, valid entries are 0.."ORD"
0160h                 128 word   (Pattern offsets) div 16

The instrument data follows immediately after the header.
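
As a rough sketch, the PTM header table above maps onto a single Python
struct layout. Several things here are assumptions rather than facts
from the format description: the name parse_ptm_header is hypothetical,
all words are taken to be in Intel (little-endian) byte order, and only
the first "PAT" entries of the pattern offset table are treated as
valid.

```python
import struct

# Field order follows the header table; the layout is 0260h (608) bytes.
PTM_HEADER = struct.Struct("<28sBHBHHHHHH4s16s32s256s128H")


def parse_ptm_header(buf: bytes) -> dict:
    """Decode the fixed-size PTM header at the start of buf."""
    if len(buf) < PTM_HEADER.size:
        raise ValueError("buffer too short for a PTM header")
    fields = PTM_HEADER.unpack_from(buf)
    (name, id_byte, version, _res1, orders, instruments, patterns,
     channels, flags, _res2, magic, _res3, pan, order_list) = fields[:14]
    if id_byte != 26 or magic != b"PTMF":
        raise ValueError("not a PTM file")
    return {
        "name": name.split(b"\0", 1)[0].decode("ascii", "replace"),
        "version": version,
        "orders": orders,
        "instruments": instruments,
        "patterns": patterns,
        "channels": channels,
        "flags": flags,
        "pan": list(pan[:channels]),
        "order_list": list(order_list[:orders]),
        # Stored values are offsets divided by 16, so multiply back.
        # Assumption: only the first "PAT" entries are meaningful.
        "pattern_offsets": [p * 16 for p in fields[14:14 + patterns]],
    }
```

A caller would typically read the first 608 bytes of the file, pass
them to parse_ptm_header, and then continue with the instrument
records that follow the header.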
--- PTM instrument format
There are 0.."INS" instruments in the file, each of the following
format :

OFFSET              Count TYPE   Description
0000h                   1 byte   Sample type (bit mapped)
                                 0,1 : 0 - no sample (instrument info only)
                                       1 - normal sample (FileOfs / Length
                                           fields are valid)
                                       2 - OPL2 / OPL3 instrument (not used)
                                       3 - MIDI instrument (not used)
                                   2 - sample loop (0 = no loop, 1 = loop)
                                   3 - loop type (0 = unidirectional,
                                       1 = bidirectional)
                                   4 - sample resolution (0 = 8 bits,
                                       1 = 16 bits)
0001h                  12 char   Name of external sample file
000Dh                   1 byte   Default volume for sample
000Eh                   1 word   C4 speed
0010h                   1 word   reserved (0)
0012h                   1 dword  absolute? offset of sample data
0016h                   1 dword  Size of sample in bytes
001Ah                   1 dword  Start of loop
001Eh                   1 dword  End of loop
0022h                  14 byte   reserved (0)
0030h                  28 char   ASCIIZ name of sample
004Ch                   4 char   ID='PTMS'

EXTENSION:PTM
OCCURENCES:PC
--------I-QFX-------------------------------
QFX files are yet another graphic file format used to store received
fax images. The .QFX file format is proprietary to Smith Micro
Software, Inc. and is used by the Quick Link II fax software.
The QFX file header is exactly 1536 bytes long. The fax pages
themselves are stored in byte aligned, bit reversed T4 format
terminated with 6 EOLs. See CCITT Recommendation T.4 for full
documentation on this coding scheme.

OFFSET              Count TYPE   Description
0000h                   8 char   ID='QLIIFAX',0
0008h                   1 word   Number of pages in the QFX file
000Ah                   1 word   Number of scan lines on last page
000Ch                   1 dword  Number of scan lines for all pages
0010h                   1 word   Horizontal scaling
                                   1 - High res (200x200)
                                   2 - Normal res (200x100)
0012h                   1 word   Vertical scaling (always = 1).
0014h                  12 byte   reserved
0020h                 375 dword  Offsets of the single pages in the
                                 document. Page 1 always starts at offset
                                 1536. The last non-zero dword points to
                                 the end of the last page, the first zero
                                 dword marks the end of the pages.
0600h                   ? byte   Start of fax page images

EXTENSION:QFX
OCCURENCES:PC
PROGRAMS:Quick Link II
--------S-RAW-------------------------------
The RAW files are raw signed PCM sound files. PCM means Pulse Code
Modulation, which can be played through most sound devices without
further manipulation. There is no header whatsoever. The properties
include 8/16-bit samples in INTEL order, stereo or mono format.
No identification is possible.

EXTENSION:RAW
SEE ALSO:SND
--------I-RDIB------------------------------
The RDIB files are Device Independent Bitmaps used by Windows. They are
RIFF format files. The blocks are unknown to me.

SEE ALSO:RIFF
--------S-S3I-------------------------------
This is the Digiplayer/ST3.0 digital sample file format. The sample
files include information about the loop of the instrument. The AdLib
instruments have another format, listed below.

OFFSET              Count TYPE   Description
0000h                   1 byte   ID=01h
0001h                  12 char   DOS filename
000Dh                   1 byte   reserved (0)
000Eh                   1 word   Paragraph offset of the raw sample data
                                 from beginning of file.
0010h                   1 dword  Sample length in bytes
0014h                   1 dword  Start of sample loop
0018h                   1 dword  End of sample loop
001Ch                   1 byte   Playback volume of sample
001Dh                   1 byte   ??? "DSK" what ever that means
001Eh                   1 byte   Pack type
                                   0 - unpacked
                                   1 - DP30ADPCM 1
001Fh                   1 byte   Flags (bitmapped)
                                   0 - loop on/off
                                   1 - stereo sample (length bytes for left
                                       channel, then another length bytes
                                       for right channel!)
                                   2 - 16-Bit samples (in Intel byte order)
0020h                   1 dword  C2 frequency
0024h                   1 dword  reserved
0028h                   1 word   reserved
002Ah                   1 word   ID=512
002Ch                   1 dword  ?? Date of last modification ??
                                 (see table 0009)
0030h                  28 char   ASCIIZ Sample name
004Ch                   4 char   ID='SCRS'
0050h                   ? byte   Raw sample data

Here follows the AdLib instrument format, for which I don't know the
extension (yet) :

OFFSET              Count TYPE   Description
0000h                   1 byte   Instrument type
                                   2 - melodic instrument
                                   3 - bass drum
                                   4 - snare drum
                                   5 - tom tom
                                   6 - cymbal
                                   7 - hihat
0001h                  12 char   DOS file name
000Dh                   3 byte   reserved
0010h                   1 byte   Modulator description (bitmapped)
                                   0-3 - frequency multiplier
                                     4 - scale envelope
                                     5 - sustain
                                     6 - pitch vibrato
                                     7 - volume vibrato
0011h                   1 byte   Carrier description (same as modulator)
0012h                   1 byte   Modulator miscellaneous (bitmapped)
                                   0-5 - 63-volume
                                     6 - MSB of levelscale
                                     7 - LSB of levelscale
0013h                   1 byte   Carrier miscellaneous (same as modulator)
0014h                   1 byte   Modulator attack / decay byte (bitmapped)
                                   0-3 - Decay
                                   4-7 - Attack
0015h                   1 byte   Carrier attack / decay byte
                                 (same as modulator)
0016h                   1 byte   Modulator sustain / release byte
                                 (bitmapped)
                                   0-3 - Release count
                                   4-7 - 15-Sustain
0017h                   1 byte   Carrier sustain / release byte
                                 (same as modulator)
0018h                   1 byte   Modulator wave select
0019h                   1 byte   Carrier wave select
001Ah                   1 byte   Modulator feedback byte (bitmapped)
                                     0 - additive synthesis on/off
                                   1-7 - modulation feedback
001Bh                   1 byte   reserved
001Ch                   1 byte   Instrument playback volume
001Dh                   1 byte   ??? "DSK"
001Eh                   1 word   reserved
0020h                   1 dword  C2 frequency
0024h                  12 byte   reserved
0030h                  28 char   ASCIIZ Instrument name
004Ch                   4 char   ID='SCRI'

EXTENSION:S3I,SMP
OCCURENCES:PC
PROGRAMS:ScreamTracker 3.0
SEE ALSO:MTM,S3M,STM
--------M-S3M-------------------------------
The ScreamTracker composer and the ScreamTracker Music Interface Kit
(STMIK) were written by the demo group Future Crew for their
demonstrations and then released. S3M files are the files of version 3
of the ScreamTracker.

OFFSET              Count TYPE   Description
0000h                  28 char   Song name, ASCII, 0 padded
001Ch                   1 byte   ID=1Ah
001Dh                   1 byte   Filetype :
                                   16=Module
                                   17=Song
                                 ? What is this supposed to mean ?
001Eh                   1 word   Reserved
0020h                   1 word   Number of orders in song
                                 ="ORD"
0022h                   1 word   Number of instruments in song
                                 ="INS"
0024h                   1 word   Number of patterns in song
                                 ="PAT"
0026h                   1 word   Song flags, bitmapped
                                   0 - ScreamTracker 2.0 type vibrato
                                   1 - ScreamTracker 2.0 type tempo
                                   2 - Amiga type slides
                                   3 - Zero volume optimizations
                                   4 - Amiga limits
                                   5 - enable filters / sfx
0028h                   1 word   Tracker version
002Ah                   1 word   File format version
                                   1=Original format
                                   2=Original format, unsigned samples
002Ch                   4 char   ID='SCRM'
0030h                   1 byte   Maximum volume
0031h                   1 byte   Initial speed
0032h                   1 byte   Initial tempo
0033h                   1 byte   Master multiplier
                                 What's this ????
0034h                  12 byte   reserved
0040h                  32 byte   Channel balance settings
                                      0=left
                                    127=right
                                   +128=disabled
                                    255=unused
0060h               "ORD" byte   Ordering sequence of the patterns
0060h               "INS" word   Offset of the instruments in paragraphs
 +"ORD"                          from begin of header (for binary offset,
                                 multiply by 16)
0060h               "PAT" word   Offset of the pattern data from begin of
 +"ORD"                          header in paragraphs
 +"INS"*2

EXTENSION:S3M
OCCURENCES:PC
PROGRAMS:ScreamTracker 3.0
SEE ALSO:S3I,STM,S2M
--------S-SND-------------------------------
The SND files are raw unsigned PCM sound files. PCM means Pulse Code
Modulation, which can be played through most sound devices without
further manipulation. There is no header whatsoever. The properties
include 8/16-bit samples in INTEL order, stereo or mono format.
No identification is possible.

EXTENSION:SND
SEE ALSO:RAW
--------S-SDK-------------------------------
The SDK files are disk images from disks used by the Roland
S-550/S-50/S-330 sampler devices. Further information wanted.

EXTENSION:SDK
--------S-SDS-------------------------------
The SDS files are MIDI Sample Dump Standard files and are used to
transfer samples between MIDI devices. Further information wanted.

EXTENSION:SDS
SEE ALSO:MID,SDX
--------S-SDX-------------------------------
The SDX files are, like the SDS files, sample dump files used to
transfer data between MIDI devices.
EXTENSION:SDX
SEE ALSO:MID,SDS
--------S-SMP-------------------------------
The SMP files are digital sample files used by Samplevision software.
Further information wanted.

EXTENSION:SMP
--------G-TDDD------------------------------
This format is used by the Imagine rendering package. The names of the
blocks are unknown to me.

OFFSET              Count TYPE   Description

EXTENSION:IFF
OCCURENCES:Amiga,PC
PROGRAMS:Imagine package
REFERENCE:DDJ0794
SEE ALSO:IFF
--------S-TXW-------------------------------
The TXW files are disk images used by the Yamaha TX-16W. Further
information wanted.

EXTENSION:TXW
--------S-UWF-G-----------------------------
The UWF files are sample files used by the UltraTracker. Further
information wanted.

OFFSET              Count TYPE   Description
0000h                  32 char   ASCIIZ sample name
0020h                   1 char   ID=1Ah
0021h                   1 char   ID=10h
0022h                   5 char   ID='MUWFB'
0027h                   1 char   ID=0
0028h                   6 char   Length of sample as ASCII long integer
002Eh                   1 word   Length of sample ?????

EXTENSION:UWF
SEE ALSO:ULT
--------E-Windows PIF-----------------------
Windows also uses the PIF files for better performance under the DOS
box. The Windows extension of the original PIF format starts at offset
0171h.

OFFSET              Count TYPE   Description
********* not yet implemented ;-)

EXTENSION:
OCCURENCES:
PROGRAMS:
REFERENCE:DDJ #202
SEE ALSO:PIF, WINDOWS NT PIF
VALIDATION:
--------W-WKS-------------------------------
The WKS files are worksheets/spreadsheets used by the Lotus 1-2-3 and
Lotus Symphony packages. More information has yet to be found, since
this information originates from a magic file.

OFFSET              Count TYPE   Description
0000h                   5 byte   ID=0,0,2,0,4
0005h                   1 byte   WKS type :
                                       4 - Lotus 1-2-3 v1.A WKS
                                       5 - Symphony 1.0 WKS
                                   other - ?WK1 file? (Lotus 2.01+,
                                           Symphony 1.1+)

EXTENSION:WKS
OCCURENCES:PC
PROGRAMS:Lotus 1-2-3,Lotus Symphony
SEE ALSO:WKS
--------T-WORDPERFECT FILES-----------------
The WordPerfect files all have a common header, even though I don't
know anything else about them.

OFFSET              Count TYPE   Description
0000h                   4 char   ID=255,"WPC"
0004h                   4 byte   unknown
0008h                   1 byte   ID=1
0009h                   1 byte   Filetype (see table 0003)

(Table 0003)
File types of WordPerfect files
  01h - macro file
  02h - WordPerfect help file
  03h - keyboard definition file
  0Ah - document file
  0Bh - dictionary file
  0Ch - thesaurus file
  0Dh - block
  0Eh - rectangular block
  0Fh - column block
  10h - printer resource file (PRS)
  11h - setup file
  12h - prefix information file
  13h - printer resource file (ALL)
  14h - display resource file (DRS)
  15h - overlay file (WP.FIL)
  16h - graphics file (WPG)
  17h - hyphenation code module
  18h - hyphenation data module
  19h - macro resource file (MRS)
  1Ah - graphics driver (WPD)
  1Bh - hyphenation lex module

EXTENSION:various
OCCURENCES:PC
--------W-WQ1-------------------------------
The Quattro Pro spreadsheet files are similar to the WKS spreadsheet
files, and their header is somewhat similar as well. The information
again comes from a magic file, which makes only identification
possible.

OFFSET              Count TYPE   Description
0000h                   1 dword  ID=00000200h
0004h                   1 char   ID='Q'

EXTENSION:WQ1
OCCURENCES:PC
PROGRAMS:Borland Quattro Pro
REFERENCE:
SEE ALSO:WKS
VALIDATION:
--------S-ZyXEL-----------------------------
The ZyXEL modems are capable of digitizing speech; the ZFAX software
and answering machine software like VoiceConnect store the sampled
data in these files. The modems can compress the data down to
19.2k CPS (ADPCM) and 9.6k CPS (CELP). The compression algorithms may
be found in the ZyxelVoc package by N. Igl, but as the firmware on the
modems changes, so might the compression algorithm. Playback on the
modem is always possible.

OFFSET              Count TYPE   Description
0000h                   5 char   ID='ZyXEL'
0005h                   1 byte   02h, ??? format tag
0006h                   4 byte   reserved
000Ah                   1 word   Compression scheme
                                   0 - CELP
                                   1 - 2 bit ADPCM
                                   2 - 3 bit ADPCM
000Ch                   4 byte   reserved
0010h                   ? ????   Raw Data

The voice data is just the data received from the U1496 Modem/Fax.
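
The 16-byte ZyXEL voice header above is small enough to decode in a few
lines. The following Python sketch is an illustration only: the name
parse_zyxel_header is hypothetical, and the compression word is assumed
to be stored in Intel (little-endian) byte order, which the description
does not state explicitly. The raw data is returned untouched, since
decoding it depends on the modem firmware.

```python
import struct

# ID, format tag, reserved, compression scheme, reserved = 16 bytes.
ZYXEL_HEADER = struct.Struct("<5sB4sH4s")

COMPRESSION = {0: "CELP", 1: "2 bit ADPCM", 2: "3 bit ADPCM"}


def parse_zyxel_header(buf: bytes) -> dict:
    """Split a ZyXEL voice file into header fields and raw data."""
    if len(buf) < ZYXEL_HEADER.size:
        raise ValueError("buffer too short for a ZyXEL voice header")
    magic, tag, _res1, scheme, _res2 = ZYXEL_HEADER.unpack_from(buf)
    if magic != b"ZyXEL":
        raise ValueError("not a ZyXEL voice file")
    return {
        "format_tag": tag,
        "compression": COMPRESSION.get(scheme, "unknown (%d)" % scheme),
        "data": buf[ZYXEL_HEADER.size:],  # raw modem data from 0010h on
    }
```

Because no VALIDATION method is known for this format, the ID check is
the only sanity test the sketch performs.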

EXTENSION:ZVD,ZYX
OCCURENCES:PC
PROGRAMS:Voice Connect,ZFAX
REFERENCE:ZYXELVOC.*
VALIDATION:NONE
--------!-HISTORY---------------------------
History is kept within this file for convenience whilst editing ...
Date format is European/German, just for my convenience.

Date     Who What
14.03.95 MM  Introduced tables
             Last table number=0012
05.06.95 MM  + PTM format
25.07.95 MM  + PIF format
             + Paradox format description
11.08.95 MM  + MS Compress variants
18.11.95 MM  + ARC enhancements, caveats
             + HA files
22.11.95 MM  + Parts of the .CRD files
01.02.96 MM  + PNG structure
02.02.96 MM  + More on JPEG
             + TARGA entry created