Team:KAIST-Korea/Project/StructureAlignment


Structure Alignment



Single chain antibody structural alignment

Single chain antibody structural alignment protocol

Protocol

Comparing the structure of a single chain antibody with the FGF binding domain of FGFR takes four steps. First, we collect the variable region sequences of antibodies. Second, we join these variable region sequences with a linker sequence to make a single chain antibody sequence. Third, we predict the structure of the single chain antibody with a structure prediction program such as Modeller. Finally, we structurally align the predicted antibody structures with the structure of the FGF binding domain of FGFR (PDB ID: 1EVT).

Data source

A single chain antibody combines the variable regions of a known antibody with a linker sequence, and it can bind to the antigen. To build one, we need the VL and VH sequences. Sources of these antibody sequences include NCBI, UniProt and RCSB PDB. NCBI and UniProt provide sequences of variable regions (VL and VH) and of antigen binding fragments (Fab). RCSB PDB provides structures of antigen binding fragments in complex with their antigens. We only need the variable region sequence, however, so we take the last 120 to 150 residues and treat them as the variable region. Data from RCSB PDB also contain antigen sequences in addition to antibody sequences, so we filter the entries by the labels in the files to keep only the heavy chains and light chains of the antibody.

Single chain antibody synthesis

We join the antibody variable region sequences in the order VH-linker-VL to make the single chain antibody sequence. The linker sequence is GGGGSGGGGS.
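The joining step can be sketched in a few lines of Python. This is only an illustration, not the script we actually used: the function names `make_scfv` and `to_fasta` are hypothetical, and the two short fragments are placeholders rather than the 16A1 sequences. Only the linker and the VH-linker-VL order come from the text.

```python
LINKER = "GGGGSGGGGS"  # linker sequence stated in the text


def make_scfv(vh, vl, linker=LINKER):
    """Join variable regions in VH-linker-VL order (hypothetical helper)."""
    return vh.strip().upper() + linker + vl.strip().upper()


def to_fasta(name, sequence, width=60):
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Placeholder fragments for illustration only, NOT real variable regions.
vh_example = "EVQLVESGGG"
vl_example = "DIQMTQSPSS"

scfv = make_scfv(vh_example, vl_example)
print(to_fasta("example_scFv", scfv))
```

The FASTA record printed here is the kind of file the structure prediction step takes as input.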

Structure Prediction

We used the program Modeller to predict the structure of each single chain antibody from its sequence. Modeller predicts the 3D structure of a protein by homology modeling, using the structures of known similar proteins as templates. The input file is the single chain antibody sequence in FASTA format, and the output file is the predicted structure in PDB format.

Structure Alignment

In this step, we align the structure of each single chain antibody with that of the FGF binding domain of FGFR to check their structural similarity. The structure of the FGF binding domain of FGFR comes from RCSB PDB (PDB ID: 1EVT). We used the structural alignment program Matt for this job. Matt performs a structural alignment that minimizes the distance between the α-carbon chains of the two proteins, based on their common structure (α helix). The input file is the structure of a single chain antibody in PDB format. The output consists of a text file and PDB files. The text file reports the number of amino acids that make up the shared structure (Core residues), the root-mean-square deviation between the α-carbons of the two proteins over that shared structure (Core RMSD), the similarity score calculated by Matt (Raw score), and the probability that the similarity arose by chance (p-value). The PDB files contain the alignment of the single chain antibody with the FGF binding domain of FGFR.
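To make the Core RMSD value concrete, the sketch below computes a root-mean-square deviation over paired α-carbon coordinates. It is not Matt's algorithm: Matt also decides which residues to pair and how to superimpose the two structures, whereas this sketch assumes both are already done. The function name `core_rmsd` and the coordinates are made up for illustration.

```python
import math


def core_rmsd(coords_a, coords_b):
    """RMSD between paired, already superimposed C-alpha coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two atoms of one aligned pair.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up coordinates (in angstroms) for three aligned residue pairs.
a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
b = [(0.0, 2.0, 0.0), (3.8, 2.0, 0.0), (7.6, 2.0, 0.0)]
print(core_rmsd(a, b))
```

Because the sum runs only over the paired residues, a pair of proteins that shares a handful of well-matched residues can score a low RMSD even when the rest of the two structures differ.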

Result : Table

  • Name -- Except for 16A1 (the name of the antibody itself), the names of all single chain antibodies come from the PDB IDs of their sources.
  • The row filled with blue (16A1) is for our single chain antibody.
  • Rows filled with green (2VXT through 3MXW) are for single chain antibodies used for comparison.
  • Rows filled with brown (1DSE through 3O6D) are for random proteins used for comparison.
Name  Core Residues  Core RMSD  Raw Score  P-value
16A1 86 2.139 89.893 0.00443
2VXT 95 2.676 84.306 0.000222
2VXU 98 2.563 92.118 0.0000823
2VXV 77 2.64 52.269 0.0644
2ZKH 96 2.855 93.411 0.001
3AAZ 84 4.197 68.078 0.0642
3D69 137 12.224 98.134 0.00541
3EO9 80 2.282 84.665 0.00295
3EOA 116 10.402 98.544 0.0547
3EOB 68 4.685 56.357 0.3753
3EYV 85 4.704 66.669 0.1653
3FMG 81 2.629 73.298 0.0129
3FOG 90 7.243 74.031 0.0785
3G6D 72 4.403 69.774 0.1281
3GBM 119 10.331 115.929 0.0314
3GBN 77 2.788 57.204 0.0592
3GHB 101 12.861 81.189 0.3984
3GHE 96 3.302 109.208 0.00117
3GI8 127 6.837 113.467 0.00038
3GI9 52 3.534 43.56 0.166
3GIZ 89 8.217 68.702 0.3172
3GK8 91 6.84 84.048 0.00811
3GKW 74 3.205 47.096 0.1215
3GNM 135 4.739 103.812 0.00000297
3GO1 103 8.842 77.631 0.2587
3GRW 107 9.819 108.405 0.0449
3H42 69 4.389 59.165 0.3421
3HC0 67 3.065 40.982 0.1028
3HC3 67 3.173 49.109 0.0944
3HC4 36 1.998 42.538 0.206
3HI5 77 3.222 58.096 0.1832
3HI6 81 3.646 63.526 0.0394
3HMW 83 4.159 62.598 0.1976
3HMX 70 5.215 62.814 0.2228
3HNT 84 2.62 74.666 0.00118
3HNV 84 2.998 77.17 0.00374
3HR5 78 4.158 58.026 0.2978
3I50 65 2.206 52.252 0.1156
3I9G 84 5.026 64.155 0.1506
3IU3 75 3.192 56.006 0.2552
3IXT 78 2.951 68.627 0.0378
3KDM 87 4.273 65.722 0.1683
3KS0 81 3.039 55.562 0.0192
3KYK 88 4.728 68.681 0.1016
3KYM 85 6.52 66.102 0.5135
3L1O 93 2.545 85.909 0.0014
3L5W 89 4.925 72.178 0.0637
3L5X 88 4.796 65.262 0.1111
3L5Y 64 4.689 65.227 0.2007
3L95 73 5.105 70.312 0.0515
3LMJ 76 2.699 56.627 0.0237
3LQA 75 8.371 58.374 0.8315
3LS4 79 2.458 72.68 0.00694
3LS5 81 2.443 73.043 0.0014
3LZF 96 2.468 93.292 0.00155
3MLR 71 2.574 64.979 0.0165
3MLS 75 2.534 57.386 0.0202
3MLU 127 10.025 73.766 0.0305
3MLV 75 2.454 62.321 0.0218
3MLW 83 8.94 59.029 0.4846
3MLX 118 11.485 105.391 0.0961
3MLY 84 4.704 63.962 0.1023
3MLZ 82 3.995 64.563 0.123
3MUG 79 2.827 68.72 0.0223
3MXV 125 10.857 121.911 0.01
3MXW 102 11.262 78.348 0.4213
1DSE 33 4.821 25.400 0.8574
2KR2 75 4.13 43.618 0.2895
2WVN 19 2.445 11.423 0.7067
2WVP 57 6.861 52.39 0.6536
2X55 37 2.307 45.547 0.2647
2X56 37 2.269 44.949 0.2645
2X5E 45 3.922 22.009 0.636
2X6A 69 2.518 67.2 0.0277
2X6B 72 2.694 66.076 0.0433
2X6C 69 2.521 68.046 0.0277
2X6D 31 2.19 27.183 0.3866
2X7M 39 5.697 22.742 0.9606
2X8D 31 3.335 28.656 0.6961
2X9Z 79 3.027 70.217 0.0131
2XB7 44 5.444 34.426 0.746
2XCH 32 2.435 26.269 0.427
2XCK 34 3.479 33.038 0.6521
2XD6 30 2.056 29.429 0.3361
2XEY 29 2.732 28.271 0.6627
2XEZ 28 2.938 29.547 0.6961
2XF0 27 1.682 24.751 0.4138
2XF4 43 3.488 42.04 0.5361
2XGW 23 3.213 13.37 0.8533
2XIL 37 5.304 24.721 0.8651
2XJ5 37 5.302 24.937 0.865
2XJ8 41 5.224 24.707 0.9166
2XJG 54 7.366 41.923 0.899
2XJX 47 7.351 39.362 0.9192
2XL7 35 2.871 38.17 0.5818
3A5J 44 3.577 38.745 0.5856
3A5K 44 3.107 43.763 0.4649
3A7N 30 2.989 16.076 0.6424
3HJL 43 3.221 16.018 0.7916
3HN4 32 2.071 30.695 0.3804
3HRA 16 1.764 12.557 0.6734
3I0C 30 2.273 22.59 0.4501
3I0D 29 2.318 22.849 0.4659
3I0E 31 2.238 27.618 0.4123
3I0H 27 2.07 25.607 0.4119
3I0I 30 2.294 23.368 0.4412
3IG0 21 1.55 23.451 0.4946
3IJZ 32 2.372 24.986 0.4372
3IK0 32 2.375 25.483 0.4369
3IK1 32 2.366 24.831 0.4407
3K4Z 74 3.632 54.636 0.0337
3KWA 42 2.624 34.064 0.2795
3M0J 33 3.338 22.992 0.7442
3M0K 33 3.347 22.876 0.7458
3M2W 38 5.231 32.085 0.795
3M7G 21 1.537 15.84 0.5617
3MBL 40 3.433 34.5 0.3949
3MBR 36 4.071 34.958 0.7351
3MNR 37 3.391 28.647 0.6731
3MRW 28 1.956 30.63 0.3969
3MU7 29 2.567 21.007 0.4913
3MW3 47 2.665 38.629 0.3011
3MXT 62 2.828 34.447 0.2004
3MY3 28 2.995 10.878 0.5999
3N0R 33 3.11 18.853 0.6775
3N11 19 2.307 17.508 0.6986
3N4T 30 2.657 26.947 0.4295
3NE8 41 4.674 25.792 0.8496
3NGW 44 4.214 23.603 0.7468
3NIZ 41 3.035 32.651 0.3067
3NQK 98 4.418 64.835 0.0125
3NWO 35 2.127 30.301 0.3048
3NXH 38 2.224 38.17 0.3394
3O3T 38 2.854 23.219 0.4752
3O5X 65 10.8 44.035 0.9368
3O6C 41 3.624 22.934 0.6685
3O6D 41 3.602 23.293 0.6638



Result : Figures

Using the aligned PDB files and the protein 3D structure drawing program PyMOL, we again examined the structural similarity between the FGF binding domain and the single chain antibodies listed in the table. To show the structures clearly, we set the display options 'cartoon' and 'chain'. Cyan represents the structure of the FGF binding domain and green represents the antibodies being compared. In the resulting figures, we can easily see that all of the antibody structures are very similar, either completely or symmetrically.





Result Analysis

The table above shows the alignment results for 65 single chain antibodies as a control group and 1 single chain antibody from our actual experiment (16A1). Higher Core residues and Raw score, and lower Core RMSD and p-value, mean greater similarity. Among these 66 single chain antibodies, 16A1 has the 26th highest Core residues, the 2nd lowest Core RMSD, the 13th highest Raw score and the 13th lowest p-value. These results mean that our 16A1 antibody is more similar to the FGF binding domain than the average antibody.
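The rankings can be reproduced with a short script. The sketch below assumes a rank is one plus the number of rows with a strictly better value. It uses only five rows copied from the table to stay short, so the ranks it prints apply to this subset and not to the full set of 66 antibodies. The helper name `rank_of` is hypothetical.

```python
# (name, core residues, core RMSD, raw score, p-value), copied from the table.
rows = [
    ("16A1", 86, 2.139, 89.893, 0.00443),
    ("2VXT", 95, 2.676, 84.306, 0.000222),
    ("2VXU", 98, 2.563, 92.118, 0.0000823),
    ("3EO9", 80, 2.282, 84.665, 0.00295),
    ("3HC4", 36, 1.998, 42.538, 0.206),
]

CORE, RMSD, SCORE, PVALUE = 1, 2, 3, 4  # column indices in each row


def rank_of(name, rows, column, higher_is_better):
    """1-based rank: one plus the number of rows that are strictly better."""
    target = next(r for r in rows if r[0] == name)[column]
    if higher_is_better:
        better = sum(1 for r in rows if r[column] > target)
    else:
        better = sum(1 for r in rows if r[column] < target)
    return better + 1


print("core residues rank:", rank_of("16A1", rows, CORE, True))
print("core RMSD rank:", rank_of("16A1", rows, RMSD, False))
print("raw score rank:", rank_of("16A1", rows, SCORE, True))
print("p-value rank:", rank_of("16A1", rows, PVALUE, False))
```

Filling `rows` with all blue and green rows of the table gives the ranks among the full antibody set.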

In general, the Core RMSD values of the randomly selected small proteins (colored brown) are lower than those of the single chain antibodies (colored green). This may seem confusing, but the lower Core RMSD results from how Matt calculates RMSD: it computes RMSD only over the shared structure, not over the whole proteins. Therefore, we should check whether protein pairs with a low Core RMSD also have a sufficient number of Core residues.
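One way to apply this check is to discard alignments with too few Core residues before comparing Core RMSD. The sketch below does this for a few rows copied from the table. The cutoff of 60 residues is a hypothetical value chosen only for illustration and was not part of our analysis. The function name `reliable_by_rmsd` is also hypothetical.

```python
MIN_CORE_RESIDUES = 60  # hypothetical cutoff, NOT taken from the analysis

# (name, core residues, core RMSD), copied from the table.
rows = [
    ("16A1", 86, 2.139),
    ("3HC4", 36, 1.998),
    ("3M7G", 21, 1.537),
    ("3IG0", 21, 1.55),
    ("2X6A", 69, 2.518),
]


def reliable_by_rmsd(rows, min_core=MIN_CORE_RESIDUES):
    """Keep rows with enough core residues, sorted by ascending core RMSD."""
    kept = [r for r in rows if r[1] >= min_core]
    return sorted(kept, key=lambda r: r[2])


for name, core, rmsd in reliable_by_rmsd(rows):
    print(name, core, rmsd)
```

With this cutoff, 3M7G and 3IG0 drop out despite having the lowest Core RMSD values in the list, because their alignments cover only 21 residues.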