• WAP手机版 加入收藏  设为首页
网站公告

access database mdb file structure study 2

时间:2018-3-27 10:01:26   作者:成都数据恢复中心   来源:网络转载   阅读:2701   评论:0
内容摘要:TechSpecsfortheJETformatusedbyAccess1997-2010ThegoalofthisguideistodocumenteverythingIknowaboutthedifferentversionsoftheJETfileformat.Rightnowthisisworkinprogre...

Tech Specs for the JET format used by Access 1997-2010

The goal of this guide is to document everything I know about the different versions of the JET file format. Right now this is work in progress, and it probably will remain work in progress for an indeterminate time. I am updating stuff as I'm working on it, but I have no plans on actually finishing this guide.

JET is the file format used by Microsoft Access' MDB and ACCDB databases. These databases are ubiquitous in the Windows world. The file format is used not only by Access, but also as an underlying storage engine by countless third party apps. Unfortunately (but also quite understandably) Microsoft does not provide an open source implementation or a detailed technical specification.

内容



First Page

The first page of a database contains various properties.

The first 24 bytes of the header identify the file format:

Name Offset Length Type Description
Magic Number 0x00 4 bytes UINT 32 LE 0x100
File format ID 0x04 16 bytes CHAR A zero-terminated string identifying the file format
  • MDB format (Access 97-2003): "Standard Jet DB"
  • ACCDB format (Access 2007-2010): "Standard ACE DB"
Jet Version 0x14 4 bytes UINT 32 LE JET file format version
  • 0 Access 97 (Jet 3)
  • 1 Access 2000, 2002/2003 (Jet 4)
  • 2 Access 2007
  • 0x103 Access 2010

The next 126 (Jet 3) / 128 or so (Jet 4) bytes are encrypted with the RC4 key 0x6b39dac7:

0x00 34 bytes
System Collation (Jet 3) 0x22 2 bytes UINT 16 LE Jet 3 only, default collation.
System Code Page 0x24 2 bytes UINT 16 LE The system code page used in this database. Used for field names etc in Jet 3, used for OLE fields in Jet 4.
Database Key 0x26 4 bytes UINT 32 LE If this is 0, the database is not encoded.
Database password 0x2A 20 bytes (Jet 3)
40 bytes (Jet 4)
Obfuscated in Jet 4
24 bytes (Jet 3)
4 bytes (Jet 4)
System Collation (Jet 4) 0x56 2 bytes UINT 16 LE Jet 4 only, default collation. Unknown what this contains in Jet 3.
Creation date 0x5A 8 bytes DOUBLE The creation date of the database, stored in Microsofts strange date format (days since 1900 or so). Used for obfuscating the password in Jet 4.
28 bytes (Jet 3)
32 bytes (Jet 4)







Table Definition Pages

Page Header

Name Length Type Description
Page Signature 2 bytes UINT 16 LE Always 0x0102. Identifies this as a table definition page.
2 bytes UINT 16 LE
  • the value 'VC' for Jet 3
  • the number of free bytes on this page for Jet 4
Next Page 4 bytes UINT 32 LE If the table definition is too long to fit on a single page, this field contains a pointer to the next page. The next page will again be a table definition page with a header similar to this one, and the data from this page and the next must be concatenated.

Table Definition data

Table Definition Length 4 bytes UINT 32 LE The total length of the table definition.
Unknown 4 bytes
Jet 4 only
??? Unknown field, Jet 4 and later only
Number of rows 4 bytes UINT 32 LE The total number of rows in the table
Autonumber 4 bytes UINT 32 LE The next value for the autonumber field
Autonumber Increment 4 bytes
Jet 4 only
UINT 32 LE Jet 4 only, probably the amount that the autonumber is increased everytime (I haven't tested this)
Complex Autonumber 4 bytes UINT 32 LE Jet 4 only. On Access 2007 and later, this contains the Autonumber for complex fields (shared across all complex fields). Unknown for earlier versions.
Unknown 4 bytes UINT 32 LE Jet 4 only
Unknown 4 bytes UINT 32 LE Jet 4 only
Table Type / Flags? 1 bytes UINT 8

The type of the table, or maybe some table flags? Known values:

  • 0x4e user table
  • 0x53 system table
Next Column Id 2 bytes UINT 16 LE The Column Id that the next column to be created will have. Incremented every time a column is created, never decremented. Equal to the total number of columns in the table including deleted columns.
Variable Columns 2 bytes UINT 16 LE The number of variable length columns in the table
Column Count 2 bytes UINT 16 LE The number of columns in the table
Index Count 4 bytes UINT 32 LE The total number of indexes in the table, including those that aren't real indices.
Real Index Count 4 bytes UINT 32 LE The number of real indices in the table
Row Page Map 4 bytes UINT 32 LE record pointer to a page bitmap of all pages that contain rows in this table (excluding LVAL pages)
Free Space Page Map 4 bytes UINT 32 LE record pointer to a page bitmap for pages containing free space (for inserting rows).

for every real index :

Unknown A1 4 bytes ???
Index Row Count 4 bytes UINT 32 LE Unknown
Unknown A2 4 bytes ??? Jet 4 only, always 0

for every current column:

Type 1 byte UINT 8 The column type. Known types are:
  • 1 Boolean
  • 2 Integer, 8 bit
  • 3 Integer, 16 bit
  • 4 Integer, 32 bit
  • 5 Fixed Point Number, 64 bit (Money / Currency)
  • 6 Floating Point Number, 32 bit (single)
  • 7 Floating Point Number, 64 bit (double)
  • 8 Date/Time, 64 bit, (stored as double)
  • 9 Binary (up to 255 bytes)
  • 10 Text (up to 255 characters)
  • 11 OLE (long binary)
  • 12 Memo (long Text)
  • 15 GUID (global unique identifier)
  • 16 Fixed Point, 96 bit, stored in 17 bytes
  • 18 Complex field (32 bit integer key)
Unknown 4 bytes
Jet 4 only
UINT 32 LE Jet 4 only, first unknown definition block
Column Id 2 bytes UINT 16 LE A unique number for every column, doesn't change
Variable Column Number 2 bytes UINT 16 LE The offset in the variable length columns list
Column Index 2 bytes UINT 16 LE The index of this column in the table
Various 4 bytes (Jet 4)
6 bytes (Jet 3)
In Jet 3, textual columns:
  • 2 bytes collation (LCID = Localisation ID)
  • 2 bytes code page
  • 2 bytes unknown

Jet 3, decimal columns:

  • 2 bytes unknown
  • 1 byte Maximum total number of digits
  • 1 byte Number of decimal digits
  • 2 bytes unknown

Jet 4, textual columns:

  • 2 bytes collation
  • 1 byte unknown
  • 1 byte collation version number ?

Jet 4, decimal columns:

  • 1 byte Maximum total number of digits
  • 1 byte Number of decimal digits
  • 2 bytes unknown

Jet 4, complex columns:

  • 4 bytes, TDEF page number of the complex field
Column Flags 2 bytes (Jet 4)
1 byte (Jet 3)
UINT 16 LE 
UINT 8

Flags for various column properties:

  • 0x0001 fixed length column
  • 0x0002 can be null
  • 0x0004 autonumber
  • 0x0008 unknown
  • 0x0010 replication related field (or hidden?). These columns start with "s_" or "Gen_" (the "Gen_" fields are for memo fields)
  • 0x0020 unknown
  • 0x0040 automatic GUID
  • 0x0080 hyperlink. Syntax is "Link Title#http://example.com/somepage.html#" or "#PAGE.HTM#"
  • 0x0100 Compressed Unicode (Note: Memo is always compressed unicode!)
  • 0x1000 Marks the modern package type for OLE fields
  • 0x4000 Unknown, occurs together with 0x8000 in a formula type TEXT column
  • 0x8000 Unknown

Note: Complex fields always have 0x0007 as flag.

Unknown 4 bytes
(Jet 4 only)
??? Jet 4 only, usually 0
Fixed Offset 2 bytes UINT 16 LE The offset for fixed length columns
Length 2 bytes UINT 16 LE
  • field size for fixed width columns
  • maximum length for variable length columns,
  • 0 for MEMO and OLE
  • for every column:

    Name Length 2 bytes (Jet 4)
    1 byte (Jet 3)
    UINT 16 LE
    UINT 8
    The length in bytes of the column name
    Column name The column name in the default codepage (Jet3) or UCS-2 (Jet 4)

    for every real index:

    Unknown B1 4 bytes
    (Jet 4 only)
    ??? Jet 4 only
    Always one of two values:
    • 0
    • 1923

    for 10 times:

    Column Id 2 bytes UINT 16 LE The Column Id of the indexed column, or 0xFFFF if the index doesn't have so many columns
    Flags 1 byte UINT 8 Index column flags. Only known flag is 0x01 for ascending order.

    Unknown B2 4 bytes ???
    First index page 4 bytes UINT 32 LE Points to the first index page
    Flags 2 bytes (Jet 4)
    1 byte (Jet 3)
    UINT 16 LE
    UINT 8
    Index flags.
    • 0x01 Unique
    • 0x02 IgnoreNuls
    • 0x08 Required
    Unknown B3 4 bytes ??? Jet 4 only
    Unknown B4 4 bytes ??? Jet 4 only
    Always 0

    for every index (including those that aren't real):

    Unknown C1 4 bytes
    Jet 4 only
    UINT 32 LE Jet 4 only, first unknown definition block
    Always one of two values:
    • 0
    • 1625
    Index Number 4 bytes UINT 32 LE The number of the index (not necessarily sequential)
    Index Column Number 4 bytes UINT 32 LE An index into the index column list (????)
    Unknown C2 1 byte UINT 32 LE
    • 0x00 for real indices
    • 0x01 ??
    • 0x02 ??
    Unknown C3 4 bytes UINT 32 LE

    0xffffffff for real indices

    small integer (most often 2, only once 0, usually less than 30) otherwise

    Unknown C4 4 bytes ??? Always 0 for real indices
    Unknown C5 2 bytes ???

    Probably some flags

    Almost always 0x00000404 for real indices (single exception)

    Combinations of the following flags otherwise:

    • 0x0001 Unknown flag 1
    • 0x0100 Unknown flag 2
    Index Type 1 byte UINT 8
    • 0x00 normal index
    • 0x01 primary key index
    • 0x02 non-real index
    Unknown C6 4 bytes ???

    Jet 4 only

    Always 0

    for every index:

    Name Length 2 bytes (Jet 4)
    1 byte (Jet 3)
    UINT 16 LE
    UINT 8
    The length in bytes of the index name
    Column name The index name in the default codepage (Jet3) or UCS-2 (Jet 4)

    until column id is 0xFFFF:

    Column Id 2 bytes UINT 16 LE The Column Id of a variable length column
    Arg 1 4 bytes UINT 32 LE Some argument
    Arg 2 4 bytes UINT 32 LE Some other argument




    OLE fields

    The OLE fields should actually rather be called long binary fields. They can contain a number of different kinds of values. I know of the following:

    • Classic packages, as described elsewhere
    • Modern packages, found in Access 2007 and later, possibly marked by 0x1000 column flag
    • Binary LV data in the catalog

    Modern packages

    They start with an 8 byte header:

    Name Length Type Description
    Storage Type 4 bytes UINT 32 LE
    • 0 Uncompressed package
    • 1 Compressed package (DEFLATE)
    Uncompressed Size 4 bytes UINT 32 LE The uncompressed size of the package.

    The package then follows this header. If it is a compressed package, the bytes are compressed using the DEFLATE algorithm. Use the zlib library to uncompress them.

    The package consists of a short header and the payload:

    Name Length Type Description
    Package Header Length 4 bytes UINT 32 LE
      The total length of this header. The payload size is (uncompressed size) - (package header length)
    Unknown 4 bytes UINT 32 LE Contains the value 1
    Extension Length 4 bytes UINT 32 LE The length (in characters) of the file name extension / file type.
    Extension 2n bytes CHAR 16 LE The file name extension of the payload, zero terminated, in UCS-2 / UTF-16.

    The rest of the package is the payload, ie. the wrapped file.

    Table Properties (LvProp)

    Access stores various design view properties in the LvProp column of the MSysObjects table. This is the format of the data stored in this field.

    LvProp format

    Every LvProp field contains a 4 byte signature followed by a number of property blocks.

    Name Length Type Description
    LvProp Signature 4 bytes Jet 3: 'KKD\0'
    Jet 4: 'MR2\0'

    now a variable number of property blocks follows:

    Block Length 4 bytes UINT 32 LE

    Contains the length in bytes of this property block.

    Block Type 2 bytes UINT 16 LE

    The type of this property block.

    • 0x0000: Table Property Value Block
    • 0x0001: Column Property Value Block
    • 0x0080: Property Name Block
    Block Contents (Block Length)-6 bytes

    The contents of the property block, formatted according to the type field.

    Property Name Block

    There is exactly one block with a type 0x0080. This property block simply contains a list of strings. These strings are referenced by the other property blocks.

    Name Length Type Description
    Length 2 bytes UINT 16 LE
    Property Name (Length) bytes A property name.

    Property Value Blocks

    There is a property value block for the table itself (type 0x0000), and one for every column (type 0x0001). They are formatted as follows

    Name Length Type Description
    Header Length 4 bytes UINT 32 LE Length of this header
    Name Length 2 bytes UINT 16 LE Length of the following string in bytes
    Name (Name Length) bytes The name of the corresponding column, or empty string if this block is for the table.

    This is followed by a number of records containing the values stored in this property block (repeated until the end of the block):

    Length 2 bytes UINT 16 LE Length of this record in bytes
    Unknown 1 byte Some unknown flag
    Type 1 byte Type of this value
    Name index 2 bytes UINT 16 LE The name of this property. This is an index referencing to the 0x0080 block.
    Value Length 2 bytes UINT 16 LE The length of the following string in bytes.
    Value (Value Length) bytes The actual value of the property.

    相关评论
    不良信息举报中心成都网警网警110报警服务AAA级互联网行业信用360网站安全检测

    数据恢复QQ交流群:378664983    站长QQ:958754010


    蜀ICP备14015947号-2