Instructions for the use of CRYSOL

Below are short instructions for using CRYSOL, written by Dmitri Svergun. Instructions for CRYSON are provided separately. The information below corresponds to the file /hosts/lass2/d1/lss/svergun/Crysol/crysol.ins.

At the ILL, use the command crysol.



                      *************************
                       *       CRYSOL        *
                      *************************

         A program  to  evaluate  X-ray  solution  scattering of 
         biological macromolecules from their atomic coordinates




      -----         Version 1.01   29.01.96 14:19      -----
      -----           for MS-DOS and UNIX              -----


                                                  
       Written by       D.Svergun") & C.Barberato ^) 
       ----------
                        EMBL c/o DESY  Notkestrasse 85     
                        D. 22603 Hamburg,  GERMANY         
                        Tel.       (+49) (0)40 89902 125   
                        Fax.       (+49) (0)40 89902 149
                        E-mail     SVERGUN@EMBL-Hamburg.DE

   ") On leave from:     Institute of Crystallography
                         Academy of Sciences of Russia
                         117333 Leninsky pr.,59 Moscow, Russia
                        
   ^) Present address:   University of Sao Paulo
                         Institute of Physics and Chemistry 
                         of Sao Carlos,  Caixa Postal 369 
                         SEP 13560 - Sao Carlos - SP, Brasil
                           
-----------------------------------------------------------------------

                                CONTENTS


1)  INTRODUCTION

2)  INPUT DATA FILES

    2.1 PDB files
    2.2 Experimental data

3)  SAMPLE RUN

4)  READING ".SAV" FILES 

5)  READING PDB FILES

    5.1 Atoms
    5.2 Heteroatoms
    5.3 Water molecules
    5.4 Hydrogens

6)  OUTPUT FILES

7)  USEFUL FORMULAE

8)  PRACTICAL ADVICE

    8.1) Completeness of the atomic structure
    8.2) Maximum order of harmonics
    8.3) Order of the Fibonacci grid

9)  HARDWARE AND SOFTWARE REQUIREMENTS
   
    9.1) IBM-PC version
    9.2) UNIX   version

10) References

########################################################################


1) INTRODUCTION

CRYSOL is a program for evaluating solution scattering from 
macromolecules with known atomic structures [1]. The program uses 
multipole expansion of the scattering amplitudes to 
calculate the spherically averaged scattering pattern and 
takes into account the hydration shell. Given the atomic 
coordinates it can i) predict the solution scattering curve 
or ii) fit the experimental scattering curve using only two free 
parameters, the average displaced solvent volume per atomic group 
and the contrast of the hydration shell.  

#######################################################################

2)  INPUT DATA FILES

2.1) PDB files

CRYSOL reads the atomic coordinates of the structure in a 
Protein Data Bank format [2].

2.2) Experimental data

Optionally, an experimental curve (the non-smeared difference curve,
solute minus solvent) can be supplied as a sequential ASCII file.
The first line is always treated as a title. The following lines
are treated as data and should contain the momentum transfer, a
non-zero intensity and, optionally, the standard deviation in
free format (that is, separated by blanks or commas). If standard
deviations are not present, the program estimates the errors
automatically with the help of a polynomial smoothing procedure.
An error or end-of-file terminates the input stream.
The maximum number of data points is 512; the rest (if any) will
be ignored.

Two examples of input files are given below:

Example 1: Input data file with standard deviations
..........................................................................
Test 1
    .2266E-01    .4311E+03    .9046E+00
    .2370E-01    .4247E+03    .7163E+00
    .2475E-01    .4182E+03    .5839E+00
    .2579E-01    .4115E+03    .5003E+00
    .2683E-01    .4046E+03    .4543E+00
    .2787E-01    .3976E+03    .4365E+00
      ...
      ...
..........................................................................
Example 2: Input data file without standard deviations (errors will be
evaluated automatically).
..........................................................................
Test 2
.8247E-02   .8662E-05
.8513E-02   .8761E-05
.8779E-02   .8594E-05
.9045E-02   .8558E-05
.9311E-02   .8513E-05
.9577E-02   .8364E-05
.9843E-02   .8283E-05
.1011E-01   .8157E-05
.1038E-01   .8113E-05
.1064E-01   .8004E-05
 ...
 ...
-----------------------------------------------------------------------
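
The file layout described above can be read with a few lines of Python.
This is only a minimal sketch, not the reader used inside CRYSOL: the
function name read_curve and the constant MAX_POINTS are hypothetical, and
the handling of unreadable lines (stop at the first one) is an assumption
based on the statement that an error terminates the input stream.

```python
import re

MAX_POINTS = 512  # limit quoted in section 2.2


def read_curve(text):
    """Return (title, points); each point is (s, intensity, sigma or None)."""
    lines = text.splitlines()
    title = lines[0].strip() if lines else ""
    points = []
    for raw in lines[1:]:
        # Free format: fields separated by blanks and/or commas.
        fields = [f for f in re.split(r"[,\s]+", raw.strip()) if f]
        try:
            s = float(fields[0])
            intensity = float(fields[1])
            sigma = float(fields[2]) if len(fields) > 2 else None
        except (IndexError, ValueError):
            break  # an unreadable line ends the input stream
        points.append((s, intensity, sigma))
        if len(points) == MAX_POINTS:
            break  # further points are ignored
    return title, points


sample = """Test 1
    .2266E-01    .4311E+03    .9046E+00
    .2370E-01,   .4247E+03
      ...
"""
title, points = read_curve(sample)
print(title, len(points))
```

Points without a third column carry sigma = None, so the caller can decide
how to estimate the errors.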

Note that the momentum transfer is taken to be 

       s = 4 * pi * sin(theta) / lambda,                         (1)

where theta is half the scattering angle and lambda is the wavelength
in angstroms. If the angular units in the input file are different, they 
will be rescaled to (1) (that is, multiplied by a factor) according to the 
following table:
 
         Units in the input file                      Multiplier  
 s1= 4*pi*sin(theta) / lambda  [1/angstrom]            1.000    (default)
 s2= 4*pi*sin(theta) / lambda  [1/nm]                  0.100    
 s3= 2 *  sin(theta) / lambda  [1/angstrom]            6.283    (2*pi)
 s4= 2 *  sin(theta) / lambda  [1/nm]                  0.6283   (0.2*pi)
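
The rescaling can be written out explicitly. The sketch below is
illustrative only (the names MULTIPLIER and to_inverse_angstrom are
hypothetical); the multipliers follow from the factor 2*pi between the two
definitions of the momentum transfer and the factor 0.1 between 1/nm and
1/angstrom.

```python
import math

# Keys are the unit codes 1-4 offered by the "Angular units" prompt.
MULTIPLIER = {
    1: 1.0,            # 4*pi*sin(theta)/lambda in 1/angstrom
    2: 0.1,            # 4*pi*sin(theta)/lambda in 1/nm
    3: 2.0 * math.pi,  # 2*sin(theta)/lambda in 1/angstrom
    4: 0.2 * math.pi,  # 2*sin(theta)/lambda in 1/nm
}


def to_inverse_angstrom(value, units=1):
    """Convert a momentum transfer value to s = 4*pi*sin(theta)/lambda [1/A]."""
    return value * MULTIPLIER[units]


for code in sorted(MULTIPLIER):
    print(code, round(to_inverse_angstrom(1.0, code), 4))
```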

########################################################################

3) SAMPLE RUN

The program usage is illustrated below in the sample CRYSOL run.  
The CRYSOL prompts are numbered below for reference (the numbers 
are not actually displayed). The lines starting with C=> are 
explanations and comments.  The default answers are given inside 
brackets and are accepted by pressing <CR>.

C=>  From the DOS/UNIX prompt type:

C:\crysol

C=>  The following will be displayed :

     ------------------------------------------------ 
       C R Y S O L    Version 1.01 -- 29.01.96      
    Copyright 1996 ---  D.Svergun, C.Barberato & M.H.J.Koch
      ------------------------------------------------ 
                    Program options :       
     0 - evaluate scattering amplitudes and envelope 
     1 - evaluate only envelope and Flms 
     2 - read CRYSOL information from a .sav file 
     ------------------------------------------------ 

1-  Enter your option                   <         0 > : 

C=>     The default option (0) evaluates the scattering 
C=>     curve from a PDB file, optionally fits the experimental
C=>     data and saves the results.
C=>     Option (1) evaluates only the envelope function of the 
C=>     macromolecule and saves it onto the .flm file.
C=>     Option (2) retrieves the information previously
C=>     saved in the file with extension ".sav". This option  
C=>     can be used e.g. to compare the same atomic structure 
C=>     against several sets of experimental data.  

2-  Brookhaven file name                <     .pdb  > : 6lyz

C=>     This question has to be answered with the name of the file in
C=>     the PDB format [2]. If this file has the extension ".pdb"
C=>     (ex; 6lyz.pdb) it suffices to type only the name without
C=>     extension (6lyz). Other extensions must be typed explicitly.
   
3-   ------------------------------------------------ 
             Following file names will be used: 
    6lyz00.log -- CRYSOL log-file          (ASCII) 
    6lyz00.sav -- save CRYSOL information  (binary)
    6lyz00.flm --   multipole coefficients (ASCII)
    6lyz00.int --   scattering intensities (ASCII)
    6lyz00.fit -- fit to experimental data (ASCII)
    6lyz00.alm --   net partial amplitudes (binary)

C=>     Successive runs for the same PDB file will generate 
C=>     output files with increasing version names. For 
C=>     example, the next run for the file 6lyz.pdb will
C=>     generate names like 6lyz01.log, 6lyz01.sav, etc.
C=>     The current last version is checked only for the ".log"
C=>     file, i.e, if in the current directory there are the
C=>     following files: 6lyz00.log; 6lyz02.log and 6lyz10.log,
C=>     the next CRYSOL run will generate file names like 
C=>     6lyz11.log, 6lyz11.sav, etc.  
C=>     The length of the output file names is limited to 12 characters.
C=>     If the PDB file name (without extension) is longer than 
C=>     6 characters, the output file names will be truncated 
C=>     (e.g. the output file names for the PDB file atw2rsta.pdb 
C=>     will be atw2rs00.log, atw2rs00.sav, etc.).
C=>     The program also removes the string "pdb" at the beginning
C=>     of PDB file names (e.g. the output file names for a PDB
C=>     file like pdb6lyz.ent will be 6lyz00.log, 6lyz00.sav, etc.).   

     ------------------------------------------------
4-  Maximum order of  harmonics         <        12 > : 

C=>     Defines the resolution of the calculated curve. Higher
C=>     values give more accurate results but require more CPU
C=>     time, see 8.2.
     
5-  Order of Fibonacci grid             <        17 > : 

C=>     The order of Fibonacci grid defines the number of points
C=>     describing the surface of the macromolecule. The table below 
C=>     gives the relation between these numbers:
C=>
C=>     Fibonacci  order  10   11   12   13   14   15   16  17   18
C=>     Number of points  90  145  234  378  611  988 1598 2585 4182
C=>
C=>     Again, higher grid orders give a more accurate surface 
C=>     representation at the expense of CPU time, see 8.3.

6-   ----   Reciprocal space grid          ----------
     ( in s = 4*pi*sin(theta)/lambda [1/angstrom] ) 

    Maximum s value                     < 0.500     > : 

C=>    The maximum possible s is 1.0 (1/A) 
 
7-  Number of points                    <        51 > : 

C=>    Number of points in the theoretical curve (maximum = 101)

8-   Read atoms and evaluate geometrical center ... 
     Number of atoms read                              :   1001
     Number of discarded waters                        :   101

C=>    Water molecules and hydrogens are discarded, see 5) for details.

     Percent processed      10  20  30  40  50  60  70  80  90 100
     Processing atoms   : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
     Processing envelope: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

C=>    ">" symbols are printed indicating the status of the 
C=>    data processing. 

     Coefficients   saved to file 6lyz00.flm
     CRYSOL data    saved to file 6lyz00.sav

9-   * Warning:    1 atoms of unknown type were found *

C=>    This warning is printed when atoms of unknown
C=>    type are found. The lines containing these atoms are printed
C=>    in the "*.log" file. See section "Reading PDB files" for
C=>    more details.

10    --- Structural parameters (sizes in angstroms) --- 

     Electron   Rg   :  13.99       Envelope   Rg      :  14.01    
     Shape      Rg   :  13.97       Envelope  volume   : 0.1842E+05
     Shell    volume : 0.1158E+05   Envelope  surface  :  3232.    
     Shell      Rg   :  18.89       Envelope  radius   :  25.49    
     Shell    width  :  3.000       Envelope  diameter :  49.04    
     Molecular Weight: 0.1431E+05   Dry volume         : 0.1735E+05
     Displaced volume: 0.1738E+05   Average atomic rad.:  1.606    

C=>    Integral parameters of the macromolecule, 
C=>    see 7) for details.

11-  Fit the experimental curve        <Y(es)>/N(o) :  

C=>    If the user answers "N", the theoretical curve will be 
C=>    predicted with the fixed parameters, then the program stops
C=>    (default values are recommended):

       12a- Contrast of the solvation shell    <   0.03     > :
       13a- Average atomic radius              <   1.61     > :         
            Intensities    saved to file 6lyz00.int
            Net amplitudes saved to file 6lyz00.alm
C=>
C=>    Program stops
C=>
C=>    If the answer is "Y", the experimental data will be fitted

12-  File name (experimental data)       <     .dat  > : lyzexp

C=>    The default extension is ".dat" (i.e. one can type lyzexp for 
C=>    lyzexp.dat), otherwise the complete name must be typed.

  Angular units in the input file:  
  4*pi*sin(theta)/lambda [1/angstrom] (1)
  4*pi*sin(theta)/lambda [1/nm]       (2)
  2 *  sin(theta)/lambda [1/angstrom] (3)
  2 *  sin(theta)/lambda [1/nm]  (4)     <     1    >  : 1

C=> Specify the units in which the momentum transfer is given
C=> in the input file, see 2.2.


13-  Title: Lysozyme, high angles (>.22) 46 mg/ml, small angles (<.22) 15 mg/ml 
     Number of experimental points found          196

C=>    The first line of the file with the experimental curve
C=>    and the number of experimental points are displayed.  
    
14-  Solvent density                     < 0.334     > : 

C=>    The default solvent density, 0.334 electrons per cubic angstrom,
C=>    is the electron density of pure water. Solvents with high 
C=>    salt concentration may have a somewhat higher electron density, 
C=>    which can be taken into account by the user.
 
15-   ------  Fitting the experimental data ...    --- 
     Lower plotting limit in log scale   < -2.33     > : 
 
C=>    CRYSOL performs a plain grid search in the range
C=>    0 < DRO < 0.09 e/A**3 and 1.58 < Ra < 1.65    
C=>    (default limits which can be changed by the user later, 
C=>    see 17b)
C=>    The net scattering intensity is automatically scaled
C=>    to the experimental data. The best fit is plotted 
C=>    on the terminal with the data normalised so that I(0)=1.
C=>    The user can choose the displayed range on log scale
C=>    manually (see the above prompt). Numerical values displayed 
C=>    on the plot are
C=>    Dro = contrast of the hydration shell [ e/A**3]
C=>    Ra  = average displaced solvent volume per atomic group
C=>    RGE = Experimental radius of gyration
C=>    RGT = Theoretical radius of gyration
C=>    Vol = Total displaced solvent volume
C=>    Chi = Chi-square between experimental and theoretical curve 
C=>
        Press CR to continue
C=>
C=>    Pressing <CR> leaves the graphics mode; pressing
C=>    p (PC version) makes a hard copy on the printer,
C=>    see 9.1 for details. 
C=>

16-  Another parameters?                  Y(es)/<N(o)> : 

     Data fit       saved to file 6lyz00.fit
     Intensities    saved to file 6lyz00.int
     Net amplitudes saved to file 6lyz00.alm
C=>
C=>    Program stops
C=>

16a- Another parameters?                  Y(es)/<N(o)> : y

C=>    The parameters can be changed manually  
        
17-  Minimize again with new limits       Y(es)/<N(o)> : n

18-  Contrast of the solvation shell     < 0.180E-01 > : 
19-  Average atomic radius               <  1.61     > :  
     Lower plotting limit in log scale   < -2.33     > : 

C=>    [ A new plot is displayed ]

     Press CR to continue
C=>
C=>    Pressing <CR> leaves the graphics mode; pressing
C=>    p (in the PC version) makes a hard copy on the printer,
C=>    see 9.1 for details. 
C=>

16b- Another parameters?                  Y(es)/<N(o)> : y

17b- Minimize again with new limits?      Y(es)/<N(o)> : y

C=>
C=>   A new fit can be performed using other limits
C=>   for the minimization procedure.
C=>
18b- Minimum radius of atomic group      <  1.58     > : 1.60
19b- Maximum radius of atomic group      <  1.65     > : 1.64
20-  Smax in the fitting range           < 0.496     > : .3
21-  Maximum contrast in the shell       < 0.900E-01 > : 0.05
     Number of experimental points used                : 111
     ------  Fitting the experimental data ...    --- 
     Lower plotting limit in log scale   < -2.33     > : 

C=>    [ A new plot is displayed ]

     Another parameters?                  Y(es)/<N(o)> : 

     Data fit       saved to file 6lyz00.fit
     Intensities    saved to file 6lyz00.int
     Net amplitudes saved to file 6lyz00.alm
C=>
C=>    Program stops
C=>

###################################################################

4) READING ".SAV" FILES 

Once the structure has been evaluated, CRYSOL saves all the relevant 
information, including the scattering amplitudes, in the ".sav" 
file, which saves time in subsequent calculations with 
the same PDB file. 

From the DOS/UNIX prompt type:

C:\crysol

     ------------------------------------------------ 
       C R Y S O L    Version 1.01 -- 29.01.96 
   Copyright 1996 ---  D.Svergun, C.Barberato & M.H.J.Koch
     ------------------------------------------------ 
                     Program options :       
     0 - evaluate scattering amplitudes and envelope 
     1 - evaluate only envelope and Flms 
     2 - read CRYSOL information from a .sav file 
     ------------------------------------------------ 

1-  Enter your option                   <         0 > : 2
2-  Save file name                      <     .sav  > : 6lyz00
     ------------------------------------------------ 
           Following file names will be used: 
    6lyz01.log --       CRYSOL log-file    (ASCII) 
    6lyz01.int --   scattering intensities (ASCII)
    6lyz01.fit -- fit to experimental data (ASCII)
    6lyz01.alm --   net partial amplitudes (binary)
     ------------------------------------------------ 
    Maximum order of harmonic                         : 12
    Order of Fibonacci grid                           : 17
    Total number of directions                        : 2585
    Maximum scattering angle                          : 0.5000
    Number of angular points                          : 51
    PDB file name                                     : 6lyz.pdb
    Number of atoms                                   : 1001
     --- Structural parameters (sizes in angstroms) --- 
    Electron   Rg   :  13.99       Envelope   Rg      :  14.01    
    Shape      Rg   :  13.97       Envelope  volume   : 0.1842E+05
    Shell    volume : 0.1158E+05   Envelope  surface  :  3232.    
    Shell      Rg   :  18.89       Envelope  radius   :  25.49    
    Shell    width  :  3.000       Envelope  diameter :  49.04    
    Molecular weight: 0.1431E+05   Dry volume         : 0.1735E+05
    Displaced volume: 0.1738E+05   Average atomic rad.:  1.606    
11- Fit the experimental curve        <Y(es)>/N(o) : 

C=>     The above information is extracted from the .sav file
C=>     and CRYSOL goes directly to the question No 11
C=>     in the previous section 
        

###################################################################

5) READING PDB FILES

PDB files may have a broad variety of names for the atoms and residues
of the macromolecule. This section explains how the program 
extracts the type and the coordinates of each atom in the PDB file.

Nineteen atoms (C, N, O, Fe, etc.) and atomic groups (CH, CH2, NH, etc.) 
are recognized by the program. The internal variable IATYPE(j), which 
represents the j-th atom, can assume the following values:

Table 1: Possible values for IATYPE

Atomic Group     C   CH   CH2  CH3   N   NH   NH2  NH3   O   OH   
   IATYPE        1    2    3    4    5    6    7    8    9   10

Atomic Group     S   SH    P   FE   CU   CA   MG   MN   ZN
   IATYPE       11   12   13   14   15   16   17   18   19

For each line starting with the word "ATOM" or "HETATM" the 
variables CARD, ATYPE, RTYPE, X(j), Y(j) and Z(j) are read 
in the format (A6,6X,A3,2X,A3,10X,3F8.3), where CARD stands for the
word at the beginning of the line ("ATOM" or "HETATM"), ATYPE for
the atom type, RTYPE for the residue type and X(j), Y(j), Z(j) for the
coordinates of the j-th atom.
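
The format (A6,6X,A3,2X,A3,10X,3F8.3) corresponds to fixed column ranges,
which the following Python sketch makes explicit. It is an illustration
only: the function name parse_atom_line is hypothetical, and unlike a
Fortran formatted read it does not accept blank coordinate fields.

```python
def parse_atom_line(line):
    """Split one ATOM/HETATM record into CARD, ATYPE, RTYPE, X, Y, Z."""
    card = line[0:6]        # A6    columns  1-6
    atype = line[12:15]     # A3    columns 13-15 (after 6X)
    rtype = line[17:20]     # A3    columns 18-20 (after 2X)
    x = float(line[30:38])  # F8.3  columns 31-38 (after 10X)
    y = float(line[38:46])  # F8.3  columns 39-46
    z = float(line[46:54])  # F8.3  columns 47-54
    return card, atype, rtype, x, y, z


# Build a record field by field so that the column positions are explicit.
example = ("ATOM  " + " " * 6 + " CB" + " " * 2 + "ALA" + " " * 10
           + "%8.3f%8.3f%8.3f" % (-9.164, 20.763, 8.633))
print(parse_atom_line(example))
```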


5.1  Regular Atoms


The "ATOM" lines are supposed to contain only regular residues,
i.e. ATYPE and RTYPE can assume only the following symbols:

Table 2: Possible symbols for ATYPE.
 N , CA, C , O , CB, CG, CD, CE, CZ, CH, ND, NE, NZ, NH, OG, OD,
 OE, OH, SG, SD or OX 

Table 3: Possible symbols for RTYPE.
 ALA,ARG,ASN,ASP,CYS,GLN,GLU,GLY,HIS,ILE,LEU,LYS,MET,PHE,PRO,SER,
 THR,TRP,TYR or VAL

The values of IATYPE for each pair ATYPE/RTYPE are shown in table 4:

Table 4: IATYPE value for each pair of atom/residue symbols

    N  CA C  O  CB CG CD CE CZ CH ND NE NZ NH OG OD OE OH SG SD OX
ALA  6, 2, 1, 9, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9  ALA
ARG  6, 2, 1, 9, 3, 3, 3, 0, 1, 0, 0, 6, 0, 7, 0, 0, 0, 0, 0, 0, 9  ARG
ASN  6, 2, 1, 9, 3, 1, 0, 0, 0, 0, 7, 0, 0, 0, 0, 9, 0, 0, 0, 0, 9  ASN
ASP  6, 2, 1, 9, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 0, 9  ASP
CYS  6, 2, 1, 9, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,11, 0, 9  CYS
GLN  6, 2, 1, 9, 3, 3, 1, 0, 0, 0, 0, 7, 0, 0, 0, 0, 9, 0, 0, 0, 9  GLN
GLU  6, 2, 1, 9, 3, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 9  GLU
GLY  6, 3, 1, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9  GLY
HIS  6, 2, 1, 9, 3, 1, 2, 2, 0, 0, 5, 6, 0, 0, 0, 0, 0, 0, 0, 0, 9  HIS
ILE  6, 2, 1, 9, 2,-1, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9  ILE
LEU  6, 2, 1, 9, 3, 2, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9  LEU
LYS  6, 2, 1, 9, 3, 3, 3, 3, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 9  LYS
MET  6, 2, 1, 9, 3, 3, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,12, 9  MET
PHE  6, 2, 1, 9, 3, 1, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9  PHE
PRO  5, 2, 1, 9, 3, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9  PRO
SER  6, 2, 1, 9, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0,10, 0, 0, 0, 0, 0, 9  SER
THR  6, 2, 1, 9, 2, 4, 0, 0, 0, 0, 0, 0, 0, 0,10, 0, 0, 0, 0, 0, 9  THR
TRP  6, 2, 1, 9, 3, 1, 1, 2, 2, 2, 0, 6, 0, 0, 0, 0, 0, 0, 0, 0, 9  TRP
TYR  6, 2, 1, 9, 3, 1, 2, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0,10, 0, 0, 9  TYR
VAL  6, 2, 1, 9, 2, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9  VAL
    N  CA C  O  CB CG CD CE CZ CH ND NE NZ NH OG OD OE OH SG SD OX

Examples:

ATOM         CB  ALA            -9.164  20.763   8.633    !Example 1
For the above line in the input file:
CARD      = 'ATOM  '
ATYPE     = ' CB'
RTYPE     = 'ALA'
X(j)      =  -9.164
Y(j)      =  20.763
Z(j)      =   8.633
IATYPE(j) =  4      (CH3, see tables 4 and 1)

ATOM         NH2 ARG           -12.600  15.351  -1.762    !Example 2
CARD      = 'ATOM  '
ATYPE     = ' NH'
RTYPE     = 'ARG'
X(j)      =  -12.600
Y(j)      =   15.351
Z(j)      =   -1.762
IATYPE(j) =  7        

Sometimes the input file has other symbols for ATYPE and RTYPE,
for example:

ATOM         OT  LEU           -15.920  20.270   5.677     !Example 3
CARD      = 'ATOM  '
ATYPE     = ' OT'
RTYPE     = 'LEU'
X(j)      =  -15.920
Y(j)      =   20.270
Z(j)      =    5.677
IATYPE(j) =  0        

In this case the program writes this line to the ".log" file and displays 
a warning on the screen. However, it still tries to find the type of the
atom by searching table 1 with only the first two characters of ATYPE
(' O' for this example). For the current example the final value of 
IATYPE will then be 9. 

If the value of IATYPE is still zero after this procedure, an oxygen
is assigned to the atom (IATYPE(j) = 9).

5.2 Heteroatoms

For the "HETATM" lines, the value of IATYPE is extracted directly from
table 1 using only the first two characters of ATYPE.

HETATM      FE   FE              1.437  16.676  19.902    !Example 4
CARD      = 'HETATM'
ATYPE     = 'FE'
RTYPE     = 'FE '
IATYPE(j) =  14  (See table 1) 

For heteroatoms not listed in table 1, an oxygen is assumed.
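
The lookup for heteroatoms can be sketched as follows. The dictionary
reproduces table 1; everything else (the name hetatm_type, stripping blanks
and upper-casing before the comparison) is an assumption made for the
illustration and may differ from what CRYSOL does internally.

```python
# Table 1: atomic group -> IATYPE
TABLE1 = {
    "C": 1, "CH": 2, "CH2": 3, "CH3": 4, "N": 5, "NH": 6, "NH2": 7,
    "NH3": 8, "O": 9, "OH": 10, "S": 11, "SH": 12, "P": 13, "FE": 14,
    "CU": 15, "CA": 16, "MG": 17, "MN": 18, "ZN": 19,
}
OXYGEN = TABLE1["O"]  # default type for unknown heteroatoms


def hetatm_type(atype):
    """Look up IATYPE from the first two characters of ATYPE."""
    key = atype[:2].strip().upper()  # assumption: blanks and case ignored
    return TABLE1.get(key, OXYGEN)


print(hetatm_type("FE "), hetatm_type("ZN "), hetatm_type("XX "))
```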

5.3 Water Molecules

CRYSOL uses a border layer (3 A thick, with constant 
electron density) surrounding the surface of the macromolecule to 
simulate the hydration shell (for more details see [1]). The
crystallographically determined water molecules (residues 
'HOH' or 'WAT') in the input file are therefore discarded. 

Sometimes, however, the authors use other names such as
H2O, OHH etc. Such atoms may NOT be recognized and will then
be treated as default atoms (oxygens) belonging to the macromolecule,
which may introduce systematic errors. We recommend checking 
the input file manually and removing the non-standard waters.

5.4 Hydrogens

The hydrogen atoms are automatically taken into account in the atomic
groups (see 5.1) and thus should not be included in the PDB file.
Any ATOM/HETATM line in the input PDB file which has the character 'H'
in the 13th or 14th column will be discarded.
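
The rules of sections 5.3 and 5.4 amount to a simple filter on the input
records. The sketch below is a hypothetical pre-check (the name keep_record
is not part of CRYSOL) that a user could run on a PDB file to see which
lines would be kept; it assumes the standard column positions of the
residue name (18-20) and atom name (13-16).

```python
WATER_RESIDUES = {"HOH", "WAT"}  # water names recognized according to 5.3


def keep_record(line):
    """Return True if an input line would contribute an atom."""
    if not line.startswith(("ATOM", "HETATM")):
        return False  # not a coordinate record
    if line[17:20] in WATER_RESIDUES:
        return False  # crystallographic water
    if "H" in line[12:14]:
        return False  # 'H' in column 13 or 14: hydrogen
    return True


def make_record(card, atype, rtype):
    """Build a dummy record with the fields in their fixed columns."""
    return (card.ljust(6) + " " * 6 + atype + " " * 2 + rtype
            + " " * 10 + "%8.3f%8.3f%8.3f" % (0.0, 0.0, 0.0))


for rec in (make_record("ATOM", " CA", "ALA"),
            make_record("ATOM", " HA", "ALA"),
            make_record("HETATM", " O ", "HOH")):
    print(rec[:20], keep_record(rec))
```

Note that a heavy atom whose name places an 'H' in column 13 or 14 (for
example a heteroatom named HG) would be dropped by the same rule.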

#####################################################################

6) OUTPUT FILES

   6lyz00.log -- CRYSOL log-file          (ASCII) 
    Contains the screen output and the detailed warning messages.
                                             
   6lyz00.sav -- save CRYSOL information  (binary)
    Contains the amplitudes among other necessary
    information to evaluate the intensities curves.

   6lyz00.flm --   multipole coefficients (ASCII)
    Contains the multipole coefficients Flm which describe
    the particle envelope (see 7).

   6lyz00.int --   scattering intensities (ASCII)
    Contains the theoretical intensities of the particle in
    solution, in vacuo, the solvent scattering and the 
    border layer scattering (see 7).

   6lyz00.fit -- fit to experimental data (ASCII)
    Contains the fit to the experimental data.

   6lyz00.alm --   net partial amplitudes (binary)
    It is the sum of the scattering amplitudes for atoms, 
    excluded volume and border layer (see 7)
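
The naming of the output files (see the comments to prompt 3 in the sample
run) can be sketched as below. This is a hypothetical reimplementation for
illustration, not CRYSOL code: the function name output_stem is invented,
and the truncation to six characters is inferred from the atw2rsta.pdb
example.

```python
import re


def output_stem(pdb_name, existing_files):
    """Return the common stem (name + two-digit version) of the output files."""
    stem = pdb_name.rsplit(".", 1)[0]    # drop the extension, if any
    if stem.lower().startswith("pdb"):
        stem = stem[3:]                  # pdb6lyz.ent -> 6lyz
    stem = stem[:6]                      # leave room for the version digits
    pattern = re.compile(re.escape(stem) + r"(\d\d)\.log$")
    versions = []
    for name in existing_files:          # only ".log" files are checked
        match = pattern.match(name)
        if match:
            versions.append(int(match.group(1)))
    next_version = max(versions) + 1 if versions else 0
    return "%s%02d" % (stem, next_version)


print(output_stem("6lyz.pdb", ["6lyz00.log", "6lyz02.log", "6lyz10.log"]))
```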

#########################################################################

7) USEFUL FORMULAE

Schematic representation of the dissolved particle:


 . . . . . . .   wwwwwww
               wwooooooowww             * atoms
  . . . . .   wwo * *   ooow            o border atoms which form 
            wwo*  **  * *  ow             the particle envelope
 . . . . .  wo * *   * * * *oww           described by the three-dimensional
            wo**  *   *  * *  owww        angular function F(omega),
  SOLVENT   wo * ** *O--------->ooww      omega = (theta, phi) - direction 
  . . . . . wo ** *  * F(omega) * oow             in spherical coordinates
  (Ros)     wo * *    *     *  *   *ow   O is origin (Center of mass)
 . . . . .  wwoo   * *   oooooooooooww   w the solvation shell (thickness
              wwo *  * oowwwwwwwwwww       3 Angstroms assumed)
   . . . . . .  wooooowwww
                wwwwwww


The envelope function is represented as
    
                        L      l                  
            F(omega) = SUM    SUM   Flm * Ylm(omega)
                       l=0    m=-l
where
      Flm are the multipole coefficients (complex numbers)
      Ylm(omega) are spherical harmonics
      L the maximum order of harmonics (see prompt 4 in the sample run)

The scattering intensity is evaluated as

             L      l                  
     I(s) = SUM    SUM  [ Alm(s) - Ros*Clm(s) + Dro*Blm(s) ]**2  
            l=0    m=-l

where   Alm(s) are partial amplitudes of the atomic scattering 
        Blm(s) are partial amplitudes of the border scattering 
        Clm(s) are partial amplitudes of the shape  scattering 
        Ros    is the electron density of the solvent (e/A**3)
        Dro    is the contrast in the solvation shell (e/A**3)

I(s) depends on the two parameters: Dro and the average volume
per atomic group which modifies the Clm's.

The output file .INT contains four curves: 

(1) the difference intensity  I(s), 

(2) the atomic scattering 

             L      l                  
    Ia(s) = SUM    SUM  [ Alm(s) ]**2,  
            l=0    m=-l

(3) the shape scattering 

             L      l                  
    Ic(s) = SUM    SUM  [ Ros*Clm(s) ]**2,  
            l=0    m=-l

and (4) the border layer scattering 

             L      l                  
    Ib(s) = SUM    SUM  [ Dro*Blm(s) ]**2.  
            l=0    m=-l

The file .FIT contains the I(s) together with the experimental
curve.
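
As an illustration of how the four curves of the .INT file are related,
the sketch below evaluates them at a single value of s from given partial
amplitudes. It is not CRYSOL code: the function name and the dictionary
layout are hypothetical, the amplitudes are assumed to be complex numbers
so that the square is taken as a squared modulus, and any overall constant
factor is omitted. The default values of Ros and Dro are the defaults
offered in the sample run.

```python
def intensities(alm, blm, clm, ros=0.334, dro=0.03):
    """Return (I, Ia, Ic, Ib) at one s value.

    alm, blm, clm map (l, m) to the partial amplitudes Alm(s), Blm(s)
    and Clm(s); ros is the solvent density, dro the shell contrast.
    """
    total = atomic = shape = border = 0.0
    for key in alm:
        a, b, c = alm[key], blm[key], clm[key]
        total += abs(a - ros * c + dro * b) ** 2   # difference intensity
        atomic += abs(a) ** 2                      # atomic scattering
        shape += abs(ros * c) ** 2                 # shape scattering
        border += abs(dro * b) ** 2                # border layer scattering
    return total, atomic, shape, border


# Toy amplitudes for l = 0 only (made-up numbers, for illustration).
print(intensities({(0, 0): 2 + 0j}, {(0, 0): 10 + 0j}, {(0, 0): 3 + 0j},
                  ros=0.5, dro=0.1))
```

Because the three terms are summed before squaring, I(s) is in general not
equal to Ia(s) - Ic(s) + Ib(s); the cross terms matter.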

Other useful formulas: 
 
     Molecular Weight         :      SUM(N) Mj                 

where SUM(N) is the sum of the N atoms/atomic groups of the
structure and Mj is the atomic mass of the j-th atom/atomic
group. 

                                      / SUM(N) Nej * Rj**2 \
     Electron   Rg            : SQRT |  ------------------  |  
                                      \        Nte         /
                      
where Nej and Rj are the number of electrons and the position
of the j-th atom (in relation to the geometric center) and Nte
is the total number of electrons in the protein.
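
A minimal sketch of the electron radius of gyration, assuming that the
geometric center is the unweighted mean of the atomic coordinates (the
function name and the input layout are hypothetical):

```python
import math


def electron_rg(atoms):
    """atoms: list of (n_electrons, x, y, z) tuples, coordinates in angstroms."""
    n = len(atoms)
    # Geometric center: plain average of the coordinates (assumption).
    cx = sum(a[1] for a in atoms) / n
    cy = sum(a[2] for a in atoms) / n
    cz = sum(a[3] for a in atoms) / n
    weighted = 0.0
    electrons = 0.0
    for ne, x, y, z in atoms:
        r2 = (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
        weighted += ne * r2   # Nej * Rj**2
        electrons += ne       # accumulates Nte
    return math.sqrt(weighted / electrons)


# Two carbon atoms (6 electrons each) 2 A apart.
print(electron_rg([(6, -1.0, 0.0, 0.0), (6, 1.0, 0.0, 0.0)]))
```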

                                      /  INT F(w)**2 dA  \
     Envelope   Rg            : SQRT |  ----------------  |    
                                      \      SURF        /
   
where INT is the integral of area, F(w) is the envelope
function (distance between the geometric center to the surface
of the protein) "w" represents the polar coordinates, "dA" the
area element and SURF the area of the protein surface. 

     Envelope  radius         :     The largest value of F(w).    
     Envelope  diameter       :     The largest distance between
                                    the points which describe
                                    the protein's surface.
                                        
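     Estimated dry volume     :     0.73 * M / 0.6023          

where M is the molecular weight, 0.73 cm**3/g is the assumed partial
specific volume of the protein and 0.6023 is Avogadro's number times
1E-24, which gives the volume in cubic angstroms (for lysozyme,
M = 0.1431E+05 gives 0.1735E+05, as in the sample run).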
                                    
     Van-der-Waals volume     :     SUM(N) Vj                  

where Vj is the solvent volume displaced by the j-th atom/atomic 
group.

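                                     / 3 * Van-der-Waals volume \ 1/3
     Average atomic radius    :     |  ------------------------  |
                                     \        4 * PI * N        /

i.e. the radius of a sphere whose volume equals the average displaced
volume per atom/atomic group.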
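
The last two quantities can be checked against the structural parameters
printed in the sample run for lysozyme (molecular weight 0.1431E+05,
displaced volume 0.1738E+05, 1001 atoms). In this sketch the dry volume
constant is written as 0.6023 (Avogadro's number times 1E-24) so that the
result is in cubic angstroms, and the average radius is taken as the radius
of a sphere with the average displaced volume per atom; both readings are
assumptions chosen because they reproduce the printed values. The function
names are hypothetical.

```python
import math


def dry_volume(molecular_weight):
    """Estimated dry volume in cubic angstroms (0.73 cm**3/g assumed)."""
    return 0.73 * molecular_weight / 0.6023


def average_atomic_radius(displaced_volume, n_atoms):
    """Radius of a sphere holding the mean displaced volume per atom."""
    return (3.0 * displaced_volume / (4.0 * math.pi * n_atoms)) ** (1.0 / 3.0)


print(round(dry_volume(14310.0)))                       # sample run: 0.1735E+05
print(round(average_atomic_radius(17380.0, 1001), 3))   # sample run: 1.606
```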


######################################################################

8) PRACTICAL ADVICE

8.1) Completeness of the atomic structure

It is very common to find incomplete structures in PDB files.
The missing parts are usually flexible pieces of the protein chain 
which are not visible in the crystal structure. These parts 
contribute, however, to the solution scattering pattern, and it is 
worthwhile to try to take them into account (e.g. by inserting dummy 
atoms into the input file).

8.2) Maximum order of harmonics
 
The maximum order of harmonics defines the resolution of the 
representation of the particle. 

          Resolution = Dmax * PI / ( 2 * L )                  (8.2.1)

where "Dmax" is the maximum dimension of the particle and L is 
the maximum used order of harmonics. Usually the default value
L=12 is sufficient to have a good precision in the range 
0 < s < 0.5 [1/A].

Note that high orders will give significant contributions
only at high values of momentum transfer (s). The following
formula relates the maximum order of harmonics with the maximum
s value:

            Lmax = PI * Dmax * smax  / 12                     (8.2.2) 

Therefore, the evaluation of a SAXS curve up to 0.5 (1/A)
for lysozyme (Dmax = 45 A) requires, according to (8.2.2), L = 6
(PI * 45 * 0.5 / 12 = 5.9). Higher orders would not significantly
improve the results for this example.
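
Formulas (8.2.1) and (8.2.2) are easy to evaluate for other particles.
In the sketch below the function names are hypothetical, and rounding
Lmax up to the next integer is an assumption suggested by the lysozyme
example.

```python
import math


def resolution(dmax, max_order):
    """Resolution of the particle representation, formula (8.2.1)."""
    return dmax * math.pi / (2.0 * max_order)


def required_order(dmax, smax):
    """Maximum order of harmonics needed up to smax, formula (8.2.2)."""
    return math.ceil(math.pi * dmax * smax / 12.0)


# Lysozyme: Dmax = 45 A, curve evaluated up to s = 0.5 1/A.
print(required_order(45.0, 0.5))
print(round(resolution(45.0, 12), 2))
```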

8.3) Order of the Fibonacci grid

For most practical cases, the 17th order of the grid is more
than sufficient. The 18th order is recommended only for L>12.
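
The numbers of grid points listed in the comments to prompt 5 of the sample
run follow a simple pattern: each equals the Fibonacci number of index
(order + 1), plus one, with F(1) = F(2) = 1. This is an observation about
the tabulated values rather than a statement taken from the program, and
the function name below is hypothetical.

```python
def fibonacci_grid_points(order):
    """Number of surface points for a given order of the Fibonacci grid."""
    a, b = 1, 1                  # F(1), F(2)
    for _ in range(order - 1):
        a, b = b, a + b          # after the loop: a = F(order), b = F(order+1)
    return b + 1


print([fibonacci_grid_points(n) for n in range(10, 19)])
```

Each step in order increases the number of points by a factor of about 1.6,
which gives an idea of how the CPU time spent on the envelope scales.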

#################################################################

9)  HARDWARE AND SOFTWARE REQUIREMENTS

9.1 IBM-PC version.

The program is compiled using the Microsoft Fortran PowerStation 
1.0 with the Phar Lap MS-DOS extender and requires DOS 
version 3.3 or later, 2 Mb of free memory (Conventional + 
extended) and EGA/VGA/OVGA/SVGA video display.  An 8087/287/387/487 
mathematical co-processor is highly recommended.  To run the 
program, the DOS extender DOSXMSF.EXE must be in 
the current path/directory.

The font file COURB.FON  from the MS-Fortran distribution package  
is assumed to  be either in the working directory, or in the c:\, or
in c:\f32\lib\  directory.  If the file is not found, no text
will appear in graphic mode.

The built-in PrintScreen facility does not require preliminary
loading of the DOS GRAPHICS utility. It supports any
HP-compatible printer recognizing HP PCL commands (HP LaserJet,
DeskJet, etc.). If your printer is NOT compatible with
HP PCL, use the standard DOS PrintScreen (then GRAPHICS
is necessary).

Note: after PrintScreen, no "Eject" command is sent. The page has
to be ejected manually. This allows, in some cases (e.g. when
printing with HP printers on DIN A4 paper), two plots to be placed on
one page.

9.2 UNIX version

CRYSOL is available for the following UNIX workstations:

SUN SPARC        under Solaris 2.4;
Silicon Graphics under Irix release 5.3;
DEC Alpha        under OSF Motif (beta-version).  

The UNIX version of CRYSOL uses the public domain graphics 
package Gnuplot as a graphics library, by creating a sequence 
of data/command files in the /tmp directory which are then  
executed by Gnuplot. The software requirements are:

i)  Gnuplot  version  3.0  or higher must be installed and the 
Gnuplot executables should be in the PATH variable. If Gnuplot 
is not installed on your machine, CRYSOL will work without 
graphic output. Contact your system administrator (Gnuplot is a 
free public domain package).

ii) Environment variable GNUTERM  has to be defined  according to
the Gnuplot conventions.  If GNUTERM is  not defined, no  graphic
output is produced. 

iii)  If  the  environment  variable GNUPRINT is defined, CRYSOL
captures each  plot in  the format  of this  printer onto a fixed
disk file named /tmp/gnu-capture. This is made after the plot  is
viewed and CR is pressed. Note that this file will be OVERWRITTEN
by the next screen capture, therefore, if the user wishes to have
a hardcopy, the  file has to  be sent to  the printer or  renamed
BEFORE  pressing  CR  after  viewing  the  NEXT  plot. After CRYSOL
finishes, the file /tmp/gnu-capture contains the LAST plot. 
If GNUPRINT is not defined,  no  screen captures are produced.

Examples:

Specify

%setenv GNUTERM x11
%setenv GNUPRINT postscript

if you run CRYSOL in an X-window environment and want an
output for a generic postscript printer.

Specify

%setenv GNUTERM vttek
%setenv GNUPRINT hpljii

if you run CRYSOL on a VT-Tektronix emulation and want 
an output for the HP-Laserjet series II.

For a detailed description of the supported terminals and
printers, type 'set term' inside Gnuplot.


##################################################################

10)  References


1-  Svergun D.I., Barberato C. & Koch M.H.J. (1995) 
    J. Appl. Cryst., 28, 768-773. 
2-  Bernstein F.C. et al (1977). J. Mol. Biol., 112, 535-542.



Web document produced by Roland May, ILL Grenoble (last update: 26-Jan-2000)