The goal of this guide is to document everything I know about the different versions of the JET file format. Right now this is work in progress, and it probably will remain work in progress for an indeterminate time. I am updating stuff as I'm working on it, but I have no plans on actually finishing this guide.
JET is the file format used by Microsoft Access' MDB and ACCDB databases. These databases are ubiquitous in the Windows world. The file format is used not only by Access, but also as an underlying storage engine by countless third party apps. Unfortunately (but also quite understandably) Microsoft does not provide an open source implementation or a detailed technical specification.
The first page of a database contains various properties.
The first 24 bytes of the header identify the file format:
Name | Offset | Length | Type | Description |
---|---|---|---|---|
Magic Number | 0x00 | 4 bytes | UINT 32 LE | 0x100 |
File format ID | 0x04 | 16 bytes | CHAR | A zero-terminated string identifying the file format
|
Jet Version | 0x14 | 4 bytes | UINT 32 LE | JET file format version
|
The next 126 (Jet 3) / 128 or so (Jet 4) bytes are encrypted with the RC4 key 0x6b39dac7:
0x00 | 34 bytes | |||
System Collation (Jet 3) | 0x22 | 2 bytes | UINT 16 LE | Jet 3 only, default collation. |
System Code Page | 0x24 | 2 bytes | UINT 16 LE | The system code page used in this database. Used for field names etc in Jet 3, used for OLE fields in Jet 4. |
Database Key | 0x26 | 4 bytes | UINT 32 LE | If this is 0, the database is not encoded. |
Database password | 0x2A | 20 bytes (Jet 3) 40 bytes (Jet 4) |
Obfuscated in Jet 4 | |
24 bytes (Jet 3) 4 bytes (Jet 4) |
||||
System Collation (Jet 4) | 0x56 | 2 bytes | UINT 16 LE | Jet 4 only, default collation. Unknown what this contains in Jet 3. |
Creation date | 0x5A | 8 bytes | DOUBLE | The creation date of the database, stored in Microsofts strange date format (days since 1900 or so). Used for obfuscating the password in Jet 4. |
28 bytes (Jet 3) 32 bytes (Jet 4) |
Name | Length | Type | Description |
---|---|---|---|
Page Signature | 2 bytes | UINT 16 LE | Always 0x0102. Identifies this as a table definition page. |
2 bytes | UINT 16 LE |
|
|
Next Page | 4 bytes | UINT 32 LE | If the table definition is too long to fit on a single page, this field contains a pointer to the next page. The next page will again be a table definition page with a header similar to this one, and the data from this page and the next must be concatenated. |
Table Definition Length | 4 bytes | UINT 32 LE | The total length of the table definition. |
Unknown | 4 bytes Jet 4 only |
??? | Unknown field, Jet 4 and later only |
Number of rows | 4 bytes | UINT 32 LE | The total number of rows in the table |
Autonumber | 4 bytes | UINT 32 LE | The next value for the autonumber field |
Autonumber Increment | 4 bytes Jet 4 only |
UINT 32 LE | Jet 4 only, probably the amount that the autonumber is increased everytime (I haven't tested this) |
Complex Autonumber | 4 bytes | UINT 32 LE | Jet 4 only. On Access 2007 and later, this contains the Autonumber for complex fields (shared across all complex fields). Unknown for earlier versions. |
Unknown | 4 bytes | UINT 32 LE | Jet 4 only |
Unknown | 4 bytes | UINT 32 LE | Jet 4 only |
Table Type / Flags? | 1 bytes | UINT 8 | The type of the table, or maybe some table flags? Known values:
|
Next Column Id | 2 bytes | UINT 16 LE | The Column Id that the next column to be created will have. Incremented every time a column is created, never decremented. Equal to the total number of columns in the table including deleted columns. |
Variable Columns | 2 bytes | UINT 16 LE | The number of variable length columns in the table |
Column Count | 2 bytes | UINT 16 LE | The number of columns in the table |
Index Count | 4 bytes | UINT 32 LE | The total number of indexes in the table, including those that aren't real indices. |
Real Index Count | 4 bytes | UINT 32 LE | The number of real indices in the table |
Row Page Map | 4 bytes | UINT 32 LE | A record pointer to a page bitmap of all pages that contain rows in this table (excluding LVAL pages) |
Free Space Page Map | 4 bytes | UINT 32 LE | A record pointer to a page bitmap for pages containing free space (for inserting rows). |
for every real index :
Unknown A1 | 4 bytes | ??? | |
Index Row Count | 4 bytes | UINT 32 LE | Unknown |
Unknown A2 | 4 bytes | ??? | Jet 4 only, always 0 |
for every current column:
Type | 1 byte | UINT 8 | The column type. Known types are:
|
Unknown | 4 bytes Jet 4 only |
UINT 32 LE | Jet 4 only, first unknown definition block |
Column Id | 2 bytes | UINT 16 LE | A unique number for every column, doesn't change |
Variable Column Number | 2 bytes | UINT 16 LE | The offset in the variable length columns list |
Column Index | 2 bytes | UINT 16 LE | The index of this column in the table |
Various | 4 bytes (Jet 4) 6 bytes (Jet 3) |
In Jet 3, textual columns:
Jet 3, decimal columns:
Jet 4, textual columns:
Jet 4, decimal columns:
Jet 4, complex columns:
|
|
Column Flags | 2 bytes (Jet 4) 1 byte (Jet 3) |
UINT 16 LE UINT 8 |
Flags for various column properties:
Note: Complex fields always have 0x0007 as flag. |
Unknown | 4 bytes (Jet 4 only) |
??? | Jet 4 only, usually 0 |
Fixed Offset | 2 bytes | UINT 16 LE | The offset for fixed length columns |
Length | 2 bytes | UINT 16 LE |
for every column:
Name Length | 2 bytes (Jet 4) 1 byte (Jet 3) |
UINT 16 LE UINT 8 |
The length in bytes of the column name |
Column name | The column name in the default codepage (Jet3) or UCS-2 (Jet 4) |
for every real index:
Unknown B1 | 4 bytes (Jet 4 only) |
??? | Jet 4 only Always one of two values:
|
for 10 times:
Column Id | 2 bytes | UINT 16 LE | The Column Id of the indexed column, or 0xFFFF if the index doesn't have so many columns |
Flags | 1 byte | UINT 8 | Index column flags. Only known flag is 0x01 for ascending order. |
Unknown B2 | 4 bytes | ??? | |
First index page | 4 bytes | UINT 32 LE | Points to the first index page |
Flags | 2 bytes (Jet 4) 1 byte (Jet 3) |
UINT 16 LE UINT 8 |
Index flags.
|
Unknown B3 | 4 bytes | ??? | Jet 4 only |
Unknown B4 | 4 bytes | ??? | Jet 4 only Always 0 |
for every index (including those that aren't real):
Unknown C1 | 4 bytes Jet 4 only |
UINT 32 LE | Jet 4 only, first unknown definition block Always one of two values:
|
Index Number | 4 bytes | UINT 32 LE | The number of the index (not necessarily sequential) |
Index Column Number | 4 bytes | UINT 32 LE | An index into the index column list (????) |
Unknown C2 | 1 byte | UINT 32 LE |
|
Unknown C3 | 4 bytes | UINT 32 LE | 0xffffffff for real indices small integer (most often 2, only once 0, usually less than 30) otherwise |
Unknown C4 | 4 bytes | ??? | Always 0 for real indices |
Unknown C5 | 2 bytes | ??? | Probably some flags Almost always 0x00000404 for real indices (single exception) Combinations of the following flags otherwise:
|
Index Type | 1 byte | UINT 8 |
|
Unknown C6 | 4 bytes | ??? | Jet 4 only Always 0 |
for every index:
Name Length | 2 bytes (Jet 4) 1 byte (Jet 3) |
UINT 16 LE UINT 8 |
The length in bytes of the index name |
Column name | The index name in the default codepage (Jet3) or UCS-2 (Jet 4) |
until column id is 0xFFFF:
Column Id | 2 bytes | UINT 16 LE | The Column Id of a variable length column |
Arg 1 | 4 bytes | UINT 32 LE | Some argument |
Arg 2 | 4 bytes | UINT 32 LE | Some other argument |
The OLE fields should actually rather be called long binary fields. They can contain a number of different kinds of values. I know of the following:
They start with an 8 byte header:
Name | Length | Type | Description |
---|---|---|---|
Storage Type | 4 bytes | UINT 32 LE |
|
Uncompressed Size | 4 bytes | UINT 32 LE | The uncompressed size of the package. |
The package then follows this header. If it is a compressed package, the bytes are compressed using the DEFLATE algorithm. Use the zlib library to uncompress them.
The package consists of a short header and the payload:
Name | Length | Type | Description |
---|---|---|---|
Package Header Length | 4 bytes | UINT 32 LE |
|
Unknown | 4 bytes | UINT 32 LE | Contains the value 1 |
Extension Length | 4 bytes | UINT 32 LE | The length (in characters) of the file name extension / file type. |
Extension | 2n bytes | CHAR 16 LE | The file name extension of the payload, zero terminated, in UCS-2 / UTF-16. |
The rest of the package is the payload, ie. the wrapped file.
Access stores various design view properties in the LvProp column of the MSysObjects table. This is the format of the data stored in this field.
Every LvProp field contains a 4 byte signature followed by a number of property blocks.
Name | Length | Type | Description |
---|---|---|---|
LvProp Signature | 4 bytes | Jet 3: 'KKD\0' Jet 4: 'MR2\0' |
now a variable number of property blocks follows:
Block Length | 4 bytes | UINT 32 LE | Contains the length in bytes of this property block. |
Block Type | 2 bytes | UINT 16 LE | The type of this property block.
|
Block Contents | (Block Length)-6 bytes | The contents of the property block, formatted according to the type field. |
There is exactly one block with a type 0x0080. This property block simply contains a list of strings. These strings are referenced by the other property blocks.
Name | Length | Type | Description |
---|---|---|---|
Length | 2 bytes | UINT 16 LE | |
Property Name | (Length) bytes | A property name. |
There is a property value block for the table itself (type 0x0000), and one for every column (type 0x0001). They are formatted as follows
Name | Length | Type | Description |
---|---|---|---|
Header Length | 4 bytes | UINT 32 LE | Length of this header |
Name Length | 2 bytes | UINT 16 LE | Length of the following string in bytes |
Name | (Name Length) bytes | The name of the corresponding column, or empty string if this block is for the table. |
This is followed by a number of records containing the values stored in this property block (repeated until the end of the block):
Length | 2 bytes | UINT 16 LE | Length of this record in bytes |
Unknown | 1 byte | Some unknown flag | |
Type | 1 byte | Type of this value | |
Name index | 2 bytes | UINT 16 LE | The name of this property. This is an index referencing to the 0x0080 block. |
Value Length | 2 bytes | UINT 16 LE | The length of the following string in bytes. |
Value | (Value Length) bytes | The actual value of the property. |