Team:KAIST-Korea/Project/Modeling

From 2010.igem.org

Revision as of 06:48, 12 August 2010 by Luftschloss

Modeling

Single chain antibody structural alignment

Single chain antibody structural alignment protocol

Protocol

Comparing the structure of a single chain antibody with the FGF binding domain of FGFR takes four steps. First, we collect the variable region sequences of antibodies. Second, we join these variable region sequences with a linker sequence to make a single chain antibody sequence. Third, we predict the structure of the single chain antibody with a structure prediction program such as Modeller. Finally, we structurally align the predicted antibody structures with the structure of the FGF binding domain of FGFR (PDB ID: 1EVT).

Data source

A single chain antibody is a combination of the variable regions of a known antibody joined by a linker sequence, and it can bind to the antigen. To build one, we need the VL and VH sequences. Sources of these antibody sequences include NCBI, UniProt and RCSB PDB. NCBI and UniProt provide the sequences of single chains of variable regions (VL and VH) and antigen binding fragments (Fab). RCSB PDB provides the structures of antigen binding fragments in complex with their antigens. We only need the sequence of the variable region, however, so we take the last 120~150 residues and treat them as the variable region. Data from RCSB PDB also contain the antigen sequences in addition to the antibody sequences, so we filter the entries by the labels in the files to keep only the heavy chains and light chains of the antibody.

Single chain antibody synthesis

We join the antibody variable region sequences in the order VH-linker-VL to make the single chain antibody sequence. The linker sequence is GGGGSGGGGS.
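
The assembly step can be sketched in a few lines of Python. This is only an illustration, not the script we actually ran: the function names `make_scfv` and `to_fasta` are hypothetical, and the short VH and VL strings in the usage example are made-up placeholders rather than real variable region sequences. Only the linker and the VH-linker-VL order come from the protocol.

```python
# Linker sequence stated in the protocol.
LINKER = "GGGGSGGGGS"


def make_scfv(vh, vl, linker=LINKER):
    """Join the variable regions in the order VH-linker-VL."""
    return vh + linker + vl


def to_fasta(name, sequence, width=60):
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder fragments, NOT real VH/VL sequences.
    scfv = make_scfv("EVQLV", "DIQMT")
    print(to_fasta("example_scFv", scfv))
```

The resulting FASTA record is the kind of sequence file that the structure prediction step takes as its input.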

Structure Prediction

We used Modeller to predict the structure of each single chain antibody from its sequence. Modeller predicts the 3D structure of a protein by homology modeling, using the structures of known similar proteins. The input file is the single chain antibody sequence in FASTA format, and the output file is the single chain antibody structure in PDB format.

Structure Alignment

In this step, we align the structure of each single chain antibody with that of the FGF binding domain of FGFR to check their structural similarity. The structure of the FGF binding domain of FGFR comes from RCSB PDB (PDB ID: 1EVT). We used the Matt structural alignment program for this. Matt performs a structural alignment that minimizes the distance between the α-carbon chains of two proteins based on their common structure (α helix). The input file is the structure of a single chain antibody in PDB format. The output consists of a text file and PDB files. The text file contains the number of amino acids that make up the shared structure (Core Residues), the root-mean-square deviation between the aligned α-carbons of the two proteins (Core RMSD), the similarity score calculated by Matt (Raw Score), and the probability that the similarity arose by chance (P-value). The PDB files contain the alignment of the single chain antibody with the FGF binding domain of FGFR.
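
To make the Core RMSD column concrete, the sketch below computes a root-mean-square deviation over paired α-carbon coordinates. It is not Matt's implementation: it assumes the two structures are already superposed and that the core residues are already paired, both of which Matt works out itself. The function name `core_rmsd` and the coordinates in the usage example are hypothetical.

```python
import math


def core_rmsd(coords_a, coords_b):
    """RMSD over paired, already superposed alpha-carbon coordinates.

    coords_a and coords_b are equal-length lists of (x, y, z) tuples,
    where the i-th entries are the alpha-carbons of an aligned residue pair.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Made-up coordinates for two aligned residue pairs.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 2.0), (1.0, 0.0, 0.0)]
    print(core_rmsd(a, b))
```

Because the sum runs only over the paired core residues, the value says nothing about the parts of the two proteins that were left out of the alignment.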

Result : Table

Name	Core Residues	Core RMSD	Raw Score	P-value
16A1	86	2.139	89.893	0.00443
2VXT	95	2.676	84.306	0.000222
2VXU	98	2.563	92.118	0.0000823
2VXV	77	2.64	52.269	0.0644
2ZKH	96	2.855	93.411	0.001
3AAZ	84	4.197	68.078	0.0642
3D69	137	12.224	98.134	0.00541
3EO9	80	2.282	84.665	0.00295
3EOA	116	10.402	98.544	0.0547
3EOB	68	4.685	56.357	0.3753
3EYV	85	4.704	66.669	0.1653
3FMG	81	2.629	73.298	0.0129
3FOG	90	7.243	74.031	0.0785
3G6D	72	4.403	69.774	0.1281
3GBM	119	10.331	115.929	0.0314
3GBN	77	2.788	57.204	0.0592
3GHB	101	12.861	81.189	0.3984
3GHE	96	3.302	109.208	0.00117
3GI8	127	6.837	113.467	0.00038
3GI9	52	3.534	43.56	0.166
3GIZ	89	8.217	68.702	0.3172
3GK8	91	6.84	84.048	0.00811
3GKW	74	3.205	47.096	0.1215
3GNM	135	4.739	103.812	0.00000297
3GO1	103	8.842	77.631	0.2587
3GRW	107	9.819	108.405	0.0449
3H42	69	4.389	59.165	0.3421
3HC0	67	3.065	40.982	0.1028
3HC3	67	3.173	49.109	0.0944
3HC4	36	1.998	42.538	0.206
3HI5	77	3.222	58.096	0.1832
3HI6	81	3.646	63.526	0.0394
3HMW	83	4.159	62.598	0.1976
3HMX	70	5.215	62.814	0.2228
3HNT	84	2.62	74.666	0.00118
3HNV	84	2.998	77.17	0.00374
3HR5	78	4.158	58.026	0.2978
3I50	65	2.206	52.252	0.1156
3I9G	84	5.026	64.155	0.1506
3IU3	75	3.192	56.006	0.2552
3IXT	78	2.951	68.627	0.0378
3KDM	87	4.273	65.722	0.1683
3KS0	81	3.039	55.562	0.0192
3KYK	88	4.728	68.681	0.1016
3KYM	85	6.52	66.102	0.5135
3L1O	93	2.545	85.909	0.0014
3L5W	89	4.925	72.178	0.0637
3L5X	88	4.796	65.262	0.1111
3L5Y	64	4.689	65.227	0.2007
3L95	73	5.105	70.312	0.0515
3LMJ	76	2.699	56.627	0.0237
3LQA	75	8.371	58.374	0.8315
3LS4	79	2.458	72.68	0.00694
3LS5	81	2.443	73.043	0.0014
3LZF	96	2.468	93.292	0.00155
3MLR	71	2.574	64.979	0.0165
3MLS	75	2.534	57.386	0.0202
3MLU	127	10.025	73.766	0.0305
3MLV	75	2.454	62.321	0.0218
3MLW	83	8.94	59.029	0.4846
3MLX	118	11.485	105.391	0.0961
3MLY	84	4.704	63.962	0.1023
3MLZ	82	3.995	64.563	0.123
3MUG	79	2.827	68.72	0.0223
3MXV	125	10.857	121.911	0.01
3MXW	102	11.262	78.348	0.4213
1DSE	33	4.821	25.400	0.8574
2KR2	75	4.13	43.618	0.2895
2WVN	19	2.445	11.423	0.7067
2WVP	57	6.861	52.39	0.6536
2X55	37	2.307	45.547	0.2647
2X56	37	2.269	44.949	0.2645
2X5E	45	3.922	22.009	0.636
2X6A	69	2.518	67.2	0.0277
2X6B	72	2.694	66.076	0.0433
2X6C	69	2.521	68.046	0.0277
2X6D	31	2.19	27.183	0.3866
2X7M	39	5.697	22.742	0.9606
2X8D	31	3.335	28.656	0.6961
2X9Z	79	3.027	70.217	0.0131
2XB7	44	5.444	34.426	0.746
2XCH	32	2.435	26.269	0.427
2XCK	34	3.479	33.038	0.6521
2XD6	30	2.056	29.429	0.3361
2XEY	29	2.732	28.271	0.6627
2XEZ	28	2.938	29.547	0.6961
2XF0	27	1.682	24.751	0.4138
2XF4	43	3.488	42.04	0.5361
2XGW	23	3.213	13.37	0.8533
2XIL	37	5.304	24.721	0.8651
2XJ5	37	5.302	24.937	0.865
2XJ8	41	5.224	24.707	0.9166
2XJG	54	7.366	41.923	0.899
2XJX	47	7.351	39.362	0.9192
2XL7	35	2.871	38.17	0.5818
3A5J	44	3.577	38.745	0.5856
3A5K	44	3.107	43.763	0.4649
3A7N	30	2.989	16.076	0.6424
3HJL	43	3.221	16.018	0.7916
3HN4	32	2.071	30.695	0.3804
3HRA	16	1.764	12.557	0.6734
3I0C	30	2.273	22.59	0.4501
3I0D	29	2.318	22.849	0.4659
3I0E	31	2.238	27.618	0.4123
3I0H	27	2.07	25.607	0.4119
3I0I	30	2.294	23.368	0.4412
3IG0	21	1.55	23.451	0.4946
3IJZ	32	2.372	24.986	0.4372
3IK0	32	2.375	25.483	0.4369
3IK1	32	2.366	24.831	0.4407
3K4Z	74	3.632	54.636	0.0337
3KWA	42	2.624	34.064	0.2795
3M0J	33	3.338	22.992	0.7442
3M0K	33	3.347	22.876	0.7458
3M2W	38	5.231	32.085	0.795
3M7G	21	1.537	15.84	0.5617
3MBL	40	3.433	34.5	0.3949
3MBR	36	4.071	34.958	0.7351
3MNR	37	3.391	28.647	0.6731
3MRW	28	1.956	30.63	0.3969
3MU7	29	2.567	21.007	0.4913
3MW3	47	2.665	38.629	0.3011
3MXT	62	2.828	34.447	0.2004
3MY3	28	2.995	10.878	0.5999
3N0R	33	3.11	18.853	0.6775
3N11	19	2.307	17.508	0.6986
3N4T	30	2.657	26.947	0.4295
3NE8	41	4.674	25.792	0.8496
3NGW	44	4.214	23.603	0.7468
3NIZ	41	3.035	32.651	0.3067
3NQK	98	4.418	64.835	0.0125
3NWO	35	2.127	30.301	0.3048
3NXH	38	2.224	38.17	0.3394
3O3T	38	2.854	23.219	0.4752
3O5X	65	10.8	44.035	0.9368
3O6C	41	3.624	22.934	0.6685
3O6D	41	3.602	23.293	0.6638

Name -- Except for 16A1 (the name of the antibody itself), the names of all single chain antibodies come from the PDB IDs of their sources.
The row filled with blue is our single chain antibody 16A1.
Rows filled with green are single chain antibodies used for comparison.
Rows filled with brown are random proteins used for comparison.

Result : Figures

Using the aligned PDB files and PyMOL, a program for drawing protein 3D structures, we again examine the structural similarity between the FGF binding domain and the single chain antibodies listed in the table. To show the structures clearly, we set the display options to 'cartoon' and 'chain'. Cyan represents the structure of the FGF binding domain and green represents the antibodies being compared. In the result, we can readily see that all of the antibody structures are very similar, completely or symmetrically.

Result Analysis

The table above shows the alignment results for 65 single chain antibodies as a control group and 1 single chain antibody for our actual experiment (16A1). Higher Core Residues and Raw Score, and lower Core RMSD and P-value, mean greater similarity. Among these 66 single chain antibodies, 16A1 ranks 26th highest in Core Residues, 2nd lowest in Core RMSD, 13th highest in Raw Score and 13th lowest in P-value. These results mean that our 16A1 antibody has higher similarity than the average antibody has.
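
A ranking of this kind can be reproduced with a short script. The sketch below is an assumption about how one might do it, not the procedure we used. The helper `rank` is hypothetical, and `ROWS` holds only five rows copied from the table, so the ranks it yields apply to that subset and not to the full set of 66 antibodies.

```python
# (name, core_residues, core_rmsd, raw_score, p_value)
# A five-row subset of the table, for illustration only.
ROWS = [
    ("16A1", 86, 2.139, 89.893, 0.00443),
    ("2VXT", 95, 2.676, 84.306, 0.000222),
    ("2VXU", 98, 2.563, 92.118, 0.0000823),
    ("3HC4", 36, 1.998, 42.538, 0.206),
    ("3EO9", 80, 2.282, 84.665, 0.00295),
]

CORE_RESIDUES, CORE_RMSD, RAW_SCORE, P_VALUE = 1, 2, 3, 4


def rank(rows, name, column, higher_is_better):
    """Return the 1-based rank of `name` after sorting rows by `column`."""
    ordered = sorted(rows, key=lambda r: r[column], reverse=higher_is_better)
    return [r[0] for r in ordered].index(name) + 1


if __name__ == "__main__":
    print("Core Residues:", rank(ROWS, "16A1", CORE_RESIDUES, True))
    print("Core RMSD:", rank(ROWS, "16A1", CORE_RMSD, False))
    print("Raw Score:", rank(ROWS, "16A1", RAW_SCORE, True))
    print("P-value:", rank(ROWS, "16A1", P_VALUE, False))
```

Running the same function over all antibody rows of the table would give ranks that can be compared with those quoted above.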

In general, randomly selected small proteins have lower Core RMSD values than single chain antibodies. This may be confusing, but the lower Core RMSD results from how Matt calculates RMSD. Matt calculates RMSD not over the whole proteins but only over the shared structure. Randomly selected small proteins with a small shared structure therefore have a lower Core RMSD. For this reason, we should check whether protein pairs with a low Core RMSD also have a sufficient number of Core Residues.
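
One way to apply this check is to require a minimum number of Core Residues before trusting a low Core RMSD. The sketch below uses a few rows copied from the table. The function name `well_supported` and both cutoffs (`max_rmsd=3.0`, `min_core=60`) are arbitrary values chosen for illustration, not thresholds from our analysis.

```python
# A few rows copied from the table: name, Core Residues, Core RMSD.
ROWS = [
    {"name": "16A1", "core_residues": 86, "core_rmsd": 2.139},
    {"name": "3EO9", "core_residues": 80, "core_rmsd": 2.282},
    {"name": "3HC4", "core_residues": 36, "core_rmsd": 1.998},
    {"name": "2XF0", "core_residues": 27, "core_rmsd": 1.682},
    {"name": "3IG0", "core_residues": 21, "core_rmsd": 1.55},
    {"name": "3D69", "core_residues": 137, "core_rmsd": 12.224},
]


def well_supported(rows, max_rmsd, min_core):
    """Keep rows whose low Core RMSD is backed by enough Core Residues."""
    return [
        r for r in rows
        if r["core_rmsd"] <= max_rmsd and r["core_residues"] >= min_core
    ]


if __name__ == "__main__":
    # Hypothetical cutoffs, chosen only to illustrate the idea.
    for row in well_supported(ROWS, max_rmsd=3.0, min_core=60):
        print(row["name"], row["core_residues"], row["core_rmsd"])
```

With these example cutoffs, 2XF0 and 3IG0 are dropped despite having the lowest Core RMSD values, because their shared structure covers fewer than 30 residues.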