Original Research, Open Access

Immunological and mutational analysis of SARS-CoV-2 structural proteins from Asian countries

Deepak Kumar Jha 1
Niti Yashvardhini 2, *
Amit Kumar 3
  1. Department of Zoology, P. C. Vigyan Mahavidyalaya, Chapra, 841 301, Bihar, India
  2. Department of Microbiology, Patna Women’s College, Patna, 800 001, Bihar, India
  3. Department of Botany, Patna University, Patna-800 005, Bihar, India
Correspondence to: Niti Yashvardhini, Department of Microbiology, Patna Women’s College, Patna, 800 001, Bihar, India. ORCID: https://orcid.org/0000-0001-9732-0251. Email: nitiyashvardhini@gmail.com.
Volume & Issue: Vol. 8 No. 5 (2021) | Page No.: 4367-4381 | DOI: 10.15419/bmrat.v8i5.675
Published: 2021-05-31

This article is published with open access by BioMedPress. This article is distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0) which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited. 

Abstract

Introduction: The emergence of a novel coronavirus, SARS-CoV-2, the etiologic agent of coronavirus disease 2019 (COVID-19), has led to a pandemic of global concern. In view of the high morbidity and mortality worldwide, the World Health Organization declared the pandemic an unprecedented public health crisis on 11 March 2020. The virus is a positive-sense RNA virus, a group that can show a high rate of mutation. Ongoing mutations in the structural proteins of the coronavirus drive viral evolution, enabling the virus to evade host immunity and rapidly acquire drug resistance. In the present study, we focused mainly on the prevalence of mutations in the four structural proteins, S (spike), E (envelope), M (membrane), and N (nucleocapsid), that are required for the assembly of a complete virion particle. Further, we estimated the antigenicity and allergenicity of these structural proteins to support the design and development of a good candidate vaccine against SARS-CoV-2.

Methods: In the present in silico study, envelope protein was found to be highly antigenic, followed by the nucleocapsid, membrane, and spike proteins of SARS-CoV-2.

Results: In this study, we detected 987 mutations in 729 sequences submitted from Asia in October 2020 by comparing them with the first Wuhan isolate from China as the reference. Among the four structural proteins, spike showed the highest number of mutations with 807 point mutations, followed by nucleocapsid with 151, envelope with 19, and membrane with only 10.

Conclusion: Taken together, our study revealed that variations occurring in the structural proteins of SARS-CoV-2 might be altering the viral structure and function, and that the envelope protein appears to be a promising vaccine candidate to curb coronavirus infections.

Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a positive-sense RNA virus that infects humans. As the etiologic agent of coronavirus disease 2019 (COVID-19), the virus induces moderate to severe respiratory distress1. The pandemic originated from an animal market in Wuhan city, China2. The ripple effect of this contagious viral disease has created a humanitarian health crisis and has become an enormous challenge to health systems across the globe.

SARS-CoV-2 is a member of the family Coronaviridae in the order Nidovirales. The virus is considered the third zoonotic coronavirus (after SARS-CoV and MERS-CoV) and originated from bats. However, this novel coronavirus has been the only one of the three with pandemic potential3, 4, 5, 6. SARS-CoV-2, a betacoronavirus, is an enveloped, single-stranded, positive-sense, non-segmented, and genetically diverse RNA virus whose genome is among the largest of the known RNA viruses (29,891 base pairs, encoding approximately 9860 amino acids)2, 7, 8. The genome of SARS-CoV-2 encodes the structural proteins spike (S), envelope (E), membrane (M), and nucleocapsid (N), as well as non-structural proteins ranging from NSP1 to NSP16.

RNA viruses generally show a high rate of mutation, substantially higher than that of DNA viruses. Owing to the high rate of mutation shown by SARS-CoV-2 over a short period, the virus exhibits genomic variability, which enables it to modulate its virulence in the host and subsequently evade host immunity9, 10.

In the present work, we detected 987 mutations in 729 sequences submitted from Asia in October 2020. Among the four structural proteins, spike showed the highest number of mutations with 807 point mutations, followed by nucleocapsid with 151. Envelope showed 19 mutations and membrane only 10. The results of our study suggest that mutational analysis of this virus can help in understanding its genomic variability. In addition, using predictive immunoinformatics tools, the antigenicity and allergenicity of the structural proteins of SARS-CoV-2 were determined to support the development of efficacious antiviral therapeutics or vaccines against COVID-19.

Methods

Data mining

The full-length protein sequences of the SARS-CoV-2 structural proteins, namely envelope protein, nucleocapsid phosphoprotein, surface glycoprotein, and membrane glycoprotein, were retrieved from the NCBI Virus database, as submitted from Asia in October 2020. In total, 729 structural protein sequences had been submitted from Asia in that month: 165 envelope proteins, 159 nucleocapsid phosphoproteins, 246 surface glycoproteins, and 159 membrane glycoproteins. Four reference sequences, for envelope protein (YP_009724392), nucleocapsid phosphoprotein (YP_009724397), surface glycoprotein (YP_009724390), and membrane glycoprotein (YP_009724393), were also retrieved for the mutational studies.

Multiple sequence alignment (MSA) and mutational identification

Multiple sequence alignment was performed using the Clustal Omega online platform (http://www.clustal.org/), which builds alignments from seeded guide trees and HMM profile-profile techniques11. The envelope protein, nucleocapsid phosphoprotein, surface glycoprotein, and membrane glycoprotein sequences were each aligned with their respective reference sequence. The aligned files were viewed in Jalview (https://www.jalview.org/) to identify the point mutations occurring in the different structural proteins with respect to the Wuhan-type isolate.
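The comparison of an aligned isolate with its reference can be illustrated with a minimal Python sketch. It assumes a pairwise alignment in which gaps are written as "-"; the function name call_point_mutations and the toy sequences are hypothetical and are not part of the study's workflow, in which alignments were inspected in Jalview.

```python
def call_point_mutations(reference: str, query: str, gap: str = "-") -> list[str]:
    """Return substitutions such as 'S68F' between two aligned sequences.

    Positions are 1-based and counted on the ungapped reference, so the
    labels follow the reference numbering used in the mutation tables.
    """
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have equal length")

    mutations = []
    ref_pos = 0  # position in the ungapped reference
    for ref_res, qry_res in zip(reference, query):
        if ref_res == gap:
            continue  # insertion relative to the reference: no reference position
        ref_pos += 1
        if qry_res == gap or qry_res == "X":
            continue  # deletion or ambiguous residue: not a point substitution
        if qry_res != ref_res:
            mutations.append(f"{ref_res}{ref_pos}{qry_res}")
    return mutations


# Toy aligned pair (hypothetical sequences, not SARS-CoV-2 data).
toy_reference = "MKT-AYIA"
toy_isolate = "MRTGAYLA"
print(call_point_mutations(toy_reference, toy_isolate))  # -> ['K2R', 'I6L']
```

Counting positions on the ungapped reference keeps the labels comparable across isolates even when an alignment introduces gaps.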

Antigenicity and allergenicity evaluation

The VaxiJen v2.0 server was used to estimate the antigenicity of all four structural proteins and thereby assess their suitability for vaccine production. This online server predicts antigens from the auto cross-covariance (ACC) transformation of the submitted protein sequences12. Because a good vaccine needs to be non-allergenic to the host, the allergenicity of these structural proteins was also evaluated, using the AllerTOP server, which predicts allergenicity based on size, flexibility, and other parameters13.
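The idea behind an ACC transformation can be sketched in a few lines of Python. Each residue is replaced by a vector of numerical descriptors, and for every lag the products of descriptor j at position i and descriptor k at position i + lag are averaged over the sequence. The descriptor table below is a toy with made-up values, and the function name acc_transform is hypothetical; this is not the descriptor set or the code used by the VaxiJen server.

```python
# Toy two-descriptor table (hypothetical values, for illustration only).
TOY_DESCRIPTORS = {
    "A": (0.1, -1.0),
    "G": (2.0, -4.0),
    "K": (2.5, 1.5),
    "L": (-4.0, -1.0),
    "S": (2.0, -1.5),
}


def acc_transform(sequence, descriptors, max_lag):
    """Auto cross-covariance features for one protein sequence.

    For each lag and each descriptor pair (j, k) the feature is
    sum_i z[i][j] * z[i + lag][k] / (n - lag), with n the sequence length.
    j == k gives auto-covariance terms, j != k cross-covariance terms.
    """
    n = len(sequence)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")

    z = [descriptors[residue] for residue in sequence]
    n_desc = len(z[0])

    features = []
    for lag in range(1, max_lag + 1):
        for j in range(n_desc):
            for k in range(n_desc):
                total = sum(z[i][j] * z[i + lag][k] for i in range(n - lag))
                features.append(total / (n - lag))
    return features


# Sequences of different length map to vectors of the same length.
short_vector = acc_transform("GASKL", TOY_DESCRIPTORS, max_lag=2)
long_vector = acc_transform("GASKLLKAG", TOY_DESCRIPTORS, max_lag=2)
print(len(short_vector), len(long_vector))
```

The output has max_lag times the squared number of descriptors entries regardless of sequence length, which is what allows proteins of different sizes to be compared by a single classifier.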

Figure 1

Total number of mutations occurring in the structural proteins: a. surface glycoprotein, b. envelope protein, c. membrane glycoprotein, and d. nucleocapsid phosphoprotein.

Table 1

Mutations identified by multiple sequence alignment of SARS-CoV-2 envelope protein sequences against the reference, with accession and position

Serial No.  Accession  Mutated sequence and position
1.          BCM16104   S68F
2.          BCM16116   S68F
3.          BCM16128   S68F
4.          BCM16176   S68F
5.          BCM16188   S68F
6.          BCM16200   S68F
7.          BCM16212   S68F
8.          BCM16140   S68F
9.          QOP57282   V75F
10.         QOP57300   V75F
11.         QOP57289   V75F
12.         QOP57280   V75F
13.         QOP57294   V75F
14.         QOS50800   V75F
15.         QOS50895   I46V
16.         QOS50728   V75F
17.         QOS50501   V75F
18.         QOU99241   I46V
19.         QOU99253   I46V
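The entries of Table 1 can be summarized per substitution with a short Python sketch. The list below repeats the 19 envelope mutations in the order given in the table; the helper name parse_mutation and the regular expression are illustrative assumptions, not part of the original analysis.

```python
import re
from collections import Counter

# Envelope protein mutations from Table 1, in table order (serial 1 to 19).
envelope_mutations = (
    ["S68F"] * 8      # serial 1-8
    + ["V75F"] * 6    # serial 9-14
    + ["I46V"]        # serial 15
    + ["V75F"] * 2    # serial 16-17
    + ["I46V"] * 2    # serial 18-19
)

MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label such as 'S68F' into (reference, position, substitute)."""
    match = MUTATION_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    ref_res, position, new_res = match.groups()
    return ref_res, int(position), new_res


envelope_counts = Counter(envelope_mutations)
mutated_positions = sorted({parse_mutation(m)[1] for m in envelope_mutations})

print(envelope_counts.most_common())
print(mutated_positions)  # -> [46, 68, 75]
```

Only three distinct substitutions account for all 19 envelope entries: S68F and V75F occur eight times each and I46V three times.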

Table 2

Mutations identified by multiple sequence alignment of SARS-CoV-2 nucleocapsid phosphoprotein sequences against the reference, with accession and position

Serial No.  Accession  Mutated sequence and position
1.          QJF74875   R203K
2.          QJF74875   G204R
3.          QKM75385   R203K
4.          QKM75385   G204R
5.          QKM75397   R203K
6.          QKM75397   G204R
7.          QKM75409   R203K
8.          QKM75409   G204R
9.          QKM75421   R203K
10.         QKM75421   G204R
11.         QKM75433   R203K
12.         QKM75433   G204R
13.         QKM75445   P207L
14.         QKM75445   M210I
15.         QKM75505   R203K
16.         QKM75505   G204R
17.         QKM75505   D377G
18.         QKM75517   R203K
19.         QKM75517   G204R
20.         QKM75517   D377G
21.         QKM75529   R203K
22.         QKM75529   G204R
23.         QKM75529   D377G
24.         QKM75541   G204R
25.         QKM75541   D377G
26.         QKM75541   R203K
27.         QKM75552   R203K
28.         QKM75552   G204R
29.         QKM75552   D377G
30.         QKM75563   R203K
31.         QKM75563   G204R
32.         QKM75563   D377G
33.         QKM75575   R203K
34.         QKM75575   G204R
35.         QKM75587   R203K
36.         QKM75587   G204R
37.         QKM75599   R203K
38.         QKM75599   G204R
39.         QKM75647   R203K
40.         QKM75647   G204R
41.         QKM75659   R203K
42.         QKM75659   R204R
43.         QKM75683   R203K
44.         QKM75683   G204R
45.         QKM75695   R203K
46.         QKM75695   G204R
47.         QKQ30536   R203K
48.         QKQ30536   G204R
49.         QKQ30548   R40C
50.         QKQ30560   R203K
51.         QKQ30560   G204R
52.         QKQ30572   R203K
53.         QKQ30572   G204R
54.         QKQ30584   R203K
55.         QKQ30584   G204R
56.         QLA10246   R203K
57.         QLA10246   G204R
58.         QLA10270   R203K
59.         QLA10270   G204R
60.         QLA10282   R203K
61.         QLA10282   G204R
62.         QLA10294   P383L
63.         QLA10294   R203K
64.         QLA10294   G204R
65.         QLA10306   R203K
66.         QLA10306   G204R
67.         QLA10318   G204R
68.         QLA10318   R203K
69.         QLA10330   G204R
70.         QLA10330   R203K
71.         QLA10342   G204R
72.         QLA10342   R203K
73.         QLA10354   R203K
74.         QLA10354   G204R
75.         QOI53600   P13L
76.         QOQ57020   S194L
77.         QOQ57032   S194L
78.         QOQ57044   S194L
79.         QOQ57056   S194L
80.         QOQ57068   S194L
81.         QOQ57092   M234I
82.         QOQ57104   S194L
83.         QOQ57116   S194L
84.         QOQ57129   S194L
85.         QOQ72552   S194L
86.         QOQ72564   S194L
87.         QOQ72576   S194L
88.         QOQ84803   S194L
89.         QOQ84834   S194L
90.         QOR63442   T205I
91.         QOR63454   S194L
92.         QOR63466   T205I
93.         QOR63514   A119S
94.         QOR63514   S194L
95.         QOR64241   S194L
96.         QOR64253   S194L
97.         QOS50459   P13L
98.         QOS50495   T91I
99.         QOS50507   P13L
100.        QOS50519   P13L
101.        QOS50531   P13L
102.        QOS50590   P13L
103.        QOS50650   P13L
104.        QOS50674   P13L
105.        QOS50686   P13L
106.        QOS50686   D225Y
107.        QOS50722   P13L
108.        QOS50734   P13L
109.        QOS50746   P13L
110.        QOS50746   S413I
111.        QOS50758   S413I
112.        QOS50758   P13L
113.        QOS50770   P13L
114.        QOS50782   P13L
115.        QOS50818   P13L
116.        QOS50830   P13L
117.        QOS50853   P13L
118.        QOS50865   P13L
119.        QOS50889   Q9H
120.        QOS50889   P199S
121.        QOS50901   S202N
122.        QOS50924   S202N
123.        QOS50948   P13L
124.        QOS50960   P13L
125.        QOS50972   P13L
126.        QOS50996   P13L
127.        QOS51008   P13L
128.        QOS51020   P13L
129.        QOS51032   P13L
130.        QOS51068   R209I
131.        QOS51068   P367L
132.        QOS51080   R203K
133.        QOS51080   G204R
134.        QOS51092   P13L
135.        QOS51104   R203K
136.        QOS51104   G204R
137.        QOU99154   P14L
138.        QOU99201   Q9H
139.        QOU99201   P199S
140.        QOU99223   Q9H
141.        QOU99223   P199S
142.        QOU99247   S202N
143.        QOU99259   S202N
144.        QOU99270   Q9H
145.        QOU99270   P199S
146.        QOU99281   Q9H
147.        QOU99281   P199S
148.        QOU99292   Q9H
149.        QOU99292   P199S
150.        QOU99303   Q9H
151.        QOU99303   P199S

Results

Mutational identification

A total of 729 structural protein sequences were retrieved from the NCBI Virus database for spike glycoproteins, nucleocapsid phosphoproteins, envelope proteins, and membrane glycoproteins submitted from Asian countries in October 2020, along with four reference sequences. The reference structural proteins, namely spike glycoprotein, nucleocapsid phosphoprotein, envelope protein, and membrane glycoprotein, are 1273, 419, 75, and 222 amino acids long, respectively.

After alignment, the sequences were viewed in Jalview to detect mutations in the structural proteins of the Asian isolates relative to the Wuhan isolate. Among the 729 sequences released from Asia, a total of 987 point mutations were detected across the four structural proteins (Figure 1). Among the 311 mutants, spike showed the highest number of mutations with 807 point mutations (Table 3), followed by nucleocapsid with 151 (Table 2), envelope with 19 (Table 1), and membrane with only 10 (Table 4).
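The reported totals can be cross-checked with a few lines of Python. The counts are those stated in the text; the per-sequence average is an illustrative derived figure and was not reported in the study.

```python
# Point mutations per structural protein, as reported in the Results.
point_mutations = {
    "spike": 807,
    "nucleocapsid": 151,
    "envelope": 19,
    "membrane": 10,
}

# Number of sequences retrieved per protein, as reported in the Methods.
sequences = {
    "spike": 246,
    "nucleocapsid": 159,
    "envelope": 165,
    "membrane": 159,
}

total_mutations = sum(point_mutations.values())  # 987
total_sequences = sum(sequences.values())        # 729

# Share of all detected point mutations contributed by each protein (percent).
share = {
    protein: 100 * count / total_mutations
    for protein, count in point_mutations.items()
}

# Illustrative only: average number of listed mutations per retrieved sequence.
per_sequence = {
    protein: point_mutations[protein] / sequences[protein]
    for protein in point_mutations
}

for protein in point_mutations:
    print(f"{protein:<13} {share[protein]:5.1f} %  {per_sequence[protein]:.2f} per sequence")
```

On these figures spike accounts for roughly four fifths of all detected point mutations, in line with the ordering given in the text.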

Table 3

Mutations identified by multiple sequence alignment of SARS-CoV-2 surface glycoprotein sequences against the reference, with accession and position

S. No.  Accession  Mutated sequence and position
1.      QJF74843   V367F
2.      QJF74867   D614G
3.      QOI53592   M153I
4.      QMI57728   T95I
5.      QMI57728   N185K
6.      QOI53580   D614G
7.      QOR64233   D614G
8.      QOR64245   D614G
9.      QOQ57012   D614G
10.     QOQ57024   D614G
11.     QOQ57060   D614G
12.     QOR64233   A701T
13.     QOR64233   P812L
14.     QOR64245   P812L
15.     QOQ57012   P812L
16.     QOQ57060   P812L
17.     QOR64233   H1083Q
18.     QOR64245   H1083Q
19.     QOQ57012   H1083Q
20.     QOQ57024   H1083Q
21.     QOQ57072   D614G
22.     QOQ57084   D614G
23.     QOQ57096   D614G
24.     QOQ57108   D614G
25.     QOQ57121   D614G
26.     QOQ57108   A701T
27.     QOQ57072   A701T
28.     QOQ57096   P812L
29.     QOQ57121   P812L
30.     QOQ57096   H1083Q
31.     QOQ57121   H1083Q
32.     QOQ57108   H1083Q
33.     QOQ72544   L54F
34.     QOQ72556   L54F
35.     QOQ72568   L54F
36.     QOQ72544   D614G
37.     QOQ72556   D614G
38.     QOQ72568   D614G
39.     QOQ72580   D614G
40.     QOQ84795   D614G
41.     QOQ72544   A701T
42.     QOQ72556   P812L
43.     QOQ72568   P812L
44.     QOQ72580   P812L
45.     QOQ72544   H1083Q
46.     QOQ84826   D614G
47.     QOR63434   D614G
48.     QOR63446   D614G
49.     QOR63458   D614G
50.     QOR63470   D614G
51.     QOR63434   P812L
52.     QOR63470   P812L
53.     QOQ53335   S305T
54.     QOQ53335   C488R
55.     QOR63482   D614G
56.     QOQ57036   D614G
57.     QOQ57048   D614G
58.     QOR63506   D614G
59.     QOQ53335   D614G
60.     QOQ57048   A701T
61.     QOQ57036   P812L
62.     QOQ57048   P812L
63.     QOQ57036   H1083Q
64.     QOQ57048   H1083Q
65.     QOQ53339   F2L
66.     QOQ53339   V11I
67.     QOQ53339   S13R
68.     QOQ53339   Q14H
69.     QOQ53339   R34H
70.     QOQ53339   V42I
71.     QOQ53339   R44K
72.     QOQ53339   V47I
73.     QOQ53339   F59I
74.     QOQ53339   K77N
75.     QOQ53339   D111N
76.     QOQ53339   Q115H
77.     QOQ53339   A123T
78.     QOQ53339   N487I
79.     QOQ53339   V512L
80.     QOQ53339   A522P
81.     QOQ53339   A262T
82.     QOQ53339   Q677H
83.     QOQ53336   G199R
84.     QOQ53340   A262T
85.     QOQ53338   C301S
86.     QOQ53340   R328T
87.     QOQ53337   R457T
88.     QOQ53338   D614G
89.     QOQ53340   D614G
90.     QOQ53336   A684V
91.     QOQ53336   A688P
92.     QOQ53336   V705I
93.     QOQ53337   H1048Y
94.     QOQ53337   Q1180H
95.     QOQ53337   K1181Q
96.     QOQ53341   V11I
97.     QOL24227   V11I
98.     QOQ53341   R44K
99.     QOQ53341   V47I
100.    QOL24227   K77N
101.    QOQ53341   K77N
102.    QOQ53341   K97N
103.    QOQ53341   D111N
104.    QOL24227   D111N
105.    QOL24227   R190K
106.    QOL24227   D198E
107.    QOL24225   E224K
108.    QOL24225   D228N
109.    QOL24226   E224K
110.    QOL24226   D228N
111.    QOL24227   A262T
112.    QOQ53341   Q271H
113.    QOQ53341   F275L
114.    QOL24228   V407L
115.    QOL24228   P412S
116.    QOL24227   D427H
117.    QOL24227   N440H
118.    QOL24227   Q474P
119.    QOL24228   D614G
120.    QOL24227   D614G
121.    QOQ53341   G669R
122.    QOQ53341   Q675R
123.    QOQ53341   Q677H
124.    QOL24227   S686I
125.    QOL24227   A688P
126.    QOL24225   K790Q
127.    QOL24226   K790Q
128.    QOL24225   R815K
129.    QOL24226   R815K
130.    QOL24225   D820N
131.    QOL24226   D820N
132.    QOL24225   D830N
133.    QOL24226   D830N
134.    QOL24228   P863H
135.    QOL24241   F2L
136.    QOL24241   V11I
137.    QOL24241   Q14H
138.    QOL24241   R34C
139.    QOL24241   Y37N
140.    QOL24241   V42I
141.    QOL24241   R44K
142.    QOL24241   F65I
143.    QOL78311   S94F
144.    QOL78311   T95P
145.    QOL24240   D111N
146.    QOL24240   A282T
147.    QOL78311   D568H
148.    QOL78311   D614G
149.    QOL24241   H655Y
150.    QOL24240   Q675R
151.    QOL79057   S13N
152.    QOL79057   D40E
153.    QOL79057   V42L
154.    QOL79057   S161F
155.    QOL79057   S246N
156.    QOL79057   D614G
157.    QOL79058   D614G
158.    QOL79058   R1019K
159.    QOL79058   P1090L
160.    QOL79059   V11F
161.    QOL79059   R21K
162.    QOL79135   R21K
163.    QOL79135   A222V
164.    QOL79061   K529I
165.    QOL79059   D614G
166.    QOL79135   D614G
167.    QOL79061   E619K
168.    QOL79061   G652R
169.    QOL79061   Q677H
170.    QOL79061   Y695N
171.    QOL79061   V729A
172.    QOL79136   D614G
173.    QOL79137   V11I
174.    QOL79137   Q115H
175.    QOL79137   D614G
176.    QOL79137   P863H
177.    QOL79137   Q913H
178.    QOL79137   I934T
179.    QOL79333   C136W
180.    QOL79333   N137Y
181.    QOL79333   I203L
182.    QOL21485   H207P
183.    QOL20612   E224K
184.    QOL21486   E224K
185.    QOL21486   R237K
186.    QOL21486   F238V
187.    QOL21486   Q239P
188.    QOL20612   T240S
189.    QOL79332   A262T
190.    QOL79333   L252P
191.    QOL79332   D467N
192.    QOL21486   A475V
193.    QOL20612   Q506H
194.    QOL20612   V510E
195.    QOL20612   V512E
196.    QOL20612   D614G
197.    QOL21485   V511I
198.    QOL79333   D568H
199.    QOL79332   Q675R
200.    QOL20612   V826G
201.    QOL21485   V826G
202.    QOL20612   I844M
203.    QOL21486   I844F
204.    QOL21486   R847K
205.    QOL21486   D848N
206.    QOL21486   C851W
207.    QOL20612   R847K
208.    QOL20612   D848N
209.    QOL21535   T21I
210.    QOL21535   V42A
211.    QOL21535   V511E
212.    QOL21535   C525W
213.    QOL21535   K557R
214.    QOL21535   R567K
215.    QOL21535   N657I
216.    QOL21535   Q677H
217.    QOL21535   V722A
218.    QOL21535   D737H
219.    QOL21535   T768P
220.    QOL21535   G769E
221.    QOL21535   E780K
222.    QOL21535   V781F
223.    QOL21535   K790Q
224.    QOL21535   I834J
225.    QOL21536   E224K
226.    QOL21536   V826G
227.    QOL21536   Y837N
228.    QOL21536   R847K
229.    QOL21536   D848N
230.    QOL21536   C851W
231.    QOL21536   L858F
232.    QOL21681   D228N
233.    QOL21681   E780K
234.    QOL21681   D808N
235.    QOL21681   S816P
236.    QOL21681   D820N
237.    QOL21720   T22I
238.    QOL21720   V213G