Troubleshooting PDB Files

The Protein Data Bank (PDB) contains a wealth of information on macromolecules and is operated by the Research Collaboratory for Structural Bioinformatics (RCSB). Files in PDB format can be imported and exported by many programs, including CrystalMaker.

Although the PDB file format seems relatively straightforward, it has strict formatting rules which require that certain data appear in certain columns. It has come to our attention that a number of programs, including software from the Cambridge Crystallographic Data Centre (CCDC), may produce output files which do not adhere to the PDB format, and which therefore cannot be read by other programs.

If you are having problems reading a PDB file into CrystalMaker, you should first check that this is a valid PDB file. Details of the PDB format are published by the RCSB. If you believe that the file does not adhere to the PDB formatting rules, please contact the file's creator (e.g., if the file was output by another program, contact the program's developers).

Misaligned Atom Names

One of the most common problems is the formatting of atom names. In the PDB format, the atom name occupies columns 13-16 of ATOM and HETATM records. It is composed of an atomic symbol (e.g., "C"), right-justified in columns 13-14, and trailing identifying characters (such as "A"), left-justified in columns 15-16. Many programs simply left-justify the entire atom name starting in column 13.

Example: Consider the following two HETATM lines. The first row is a column ruler, and the "||" markers indicate columns 13-14:

123456789012345678901234567890123456789012345678901234567890123456
            ||
HETATM    1 C22  UNK 0   1       5.929   7.113  -1.722  1.00  0.00
            ||
HETATM    1  C22 UNK 0   1       5.929   7.113  -1.722  1.00  0.00

The first example, generated by CCDC software, is incorrect. Notice that the element symbol is left-aligned, with the "C" appearing in column 13. The second example is correct, because the element symbol is right-aligned, with the "C" printed in column 14.
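
If you need to repair such a file yourself, the realignment shown above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of CrystalMaker: the function name realign_atom_name is hypothetical, and the code assumes that a name starting with exactly one letter followed by a non-letter (such as "C22") uses a one-letter element symbol. Names that begin with two or more letters (e.g., "CA", which could be calcium or an alpha carbon) are ambiguous and are left untouched.

```python
def realign_atom_name(line: str) -> str:
    """Right-justify a one-letter element symbol in columns 13-14.

    Hypothetical helper: only ATOM/HETATM records are changed, and only
    when the atom name starts with a single letter followed by a
    non-letter (an assumption about how the name should be read).
    """
    line = line.rstrip("\r\n")
    if not line.startswith(("ATOM  ", "HETATM")):
        return line  # other record types pass through unchanged

    padded = line.ljust(16)          # make sure columns 13-16 exist
    name = padded[12:16].strip()     # columns 13-16 hold the atom name

    # Count the leading letters; they are taken to be the element symbol.
    letters = 0
    for ch in name:
        if not ch.isalpha():
            break
        letters += 1

    # Only the unambiguous case is handled: one leading letter and a
    # name short enough (3 characters or fewer) to shift right by one.
    if letters == 1 and len(name) <= 3:
        field = " " + name.ljust(3)  # symbol lands in column 14
        return padded[:12] + field + padded[16:]
    return line


bad = "HETATM    1 C22  UNK 0   1       5.929   7.113  -1.722  1.00  0.00"
print(realign_atom_name(bad))
```

Applied to the first (incorrect) HETATM line above, this should produce the second (correct) line. A line that is already correctly aligned is returned unchanged, so the function can safely be run over every line of a file.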

Troubleshooting CIF Files