J. Dairy Sci. 87:1641-1674
© American Dairy Science Association, 2004.

Nomenclature of the Proteins of Cows’ Milk—Sixth Revision

H. M. Farrell, Jr.1, R. Jimenez-Flores2, G. T. Bleck3, E. M. Brown1, J. E. Butler4, L. K. Creamer5, C. L. Hicks6, C. M. Hollar7, K. F. Ng-Kwai-Hang8 and H. E. Swaisgood9

1 US Department of Agriculture, Eastern Regional Research Center, Wyndmoor, PA 19038
2 Department of Food Science, California Polytechnic State University, San Luis Obispo 93407
3 Gala Design, Middleton, WI 53562
4 Department of Microbiology, School of Medicine, University of Iowa, Iowa City 52240
5 Fonterra Research Centre, Palmerston North, New Zealand
6 Department of Animal Science, University of Kentucky, Lexington 40546
7 Masterfoods USA, Burr Ridge, IL 60527
8 Department of Animal Science, McGill University, Sainte Anne de Bellevue, PQ, Canada H9X 3V9
9 Department of Food Science, North Carolina State University, Raleigh 27695

Corresponding author: H. M. Farrell, Jr.; e-mail: hfarrell{at}errc.ars.usda.gov.


    ABSTRACT
This report of the American Dairy Science Association Committee on the Nomenclature, Classification, and Methodology of Milk Proteins reviews changes in the nomenclature of milk proteins necessitated by recent advances of our knowledge of milk proteins. Identification of major caseins and whey proteins continues to be based upon their primary structures. Nomenclature of the immunoglobulins consistent with new international standards has been developed, and all bovine immunoglobulins have been characterized at the molecular level. Other significant findings related to nomenclature and protein methodology are elucidation of several new genetic variants of the major milk proteins, establishment by sequencing techniques and sequence alignment of the bovine caseins and whey proteins as the reference point for the nomenclature of all homologous milk proteins, completion of crystallographic studies for major whey proteins, and advances in the study of lactoferrin, allowing it to be added to the list of fully characterized milk proteins.

Key Words: milk protein • structure • nomenclature • review

Abbreviation key: C = constant, EIMS = electrospray ionization MS, H = heavy chain, HSA = human SA, L = light chain, LF = lactoferrin, MS = mass spectroscopy, NMR = nuclear magnetic resonance, SA = serum albumin, V = variable


    INTRODUCTION
The initial report of the American Dairy Science Association Committee on the Nomenclature, Classification, and Methodology of Milk Proteins (Jenness et al., 1956) was an attempt to clarify the nomenclature of milk proteins by "presenting a summary of preferred usage and by showing the relationship between the individual proteins, which had been isolated, and the classical fractions." Subsequently, this Committee has published a revision of milk protein nomenclature approximately every 5 to 10 yr to summarize more recent findings (Table 1) and to suggest changes in nomenclature where appropriate. The intent of this Committee is to suggest a flexible nomenclature system that allows for incorporation of new discoveries rather than to suggest prematurely a rigid system of nomenclature. Since the last report of this Committee (Eigel et al., 1984), the most significant findings related to nomenclature and protein methodology are as follows.


Table 1. Proteins of bovine milk and some of their properties.1
 


    THE CASEINS
Caseins in milk of the genus Bos were defined originally by this Committee (Jenness et al., 1956) as those phosphoproteins that precipitate from raw skim milk by acidification to pH 4.6 at 20°C. In a subsequent report (Whitney et al., 1976), the Committee differentiated CN according to their relative electrophoretic mobility in alkaline polyacrylamide or starch gels containing urea with or without mercaptoethanol. In the previous report, we recommended that use of electrophoresis as a basis for classification be dropped and that CN be identified according to the homology of their primary structures (amino acid sequences) into the following families: {alpha}s1-, {alpha}s2-, ß-, and {kappa}-CN. This recommendation is affirmed, and researchers are requested to refrain from assigning specific genetic variant letters to new variants until their sequence homology can be established. Individual members of these families still can be identified by gel electrophoretic techniques, some of the more effective of which are suggested in the monograph by this Committee (Swaisgood et al., 1975).

{alpha}S1-CN
The {alpha}S1-CN family, which constitutes up to 40% of the CN fraction in bovine milk, consists of one major and one minor component. Both proteins are single-chain polypeptides with the same amino acid sequence established by Mercier et al. (1971) and Grosclaude et al. (1973) and differ only in their degree of phosphorylation. The minor component contains one additional phosphorylated serine residue at position 41 (Eigel et al., 1984). The reference protein for this family is {alpha}S1-CN B-8P, a single-chain protein with no cysteinyl residues. It consists of 199 amino acid residues: Asp7, Asn8, Thr5, Ser8, Ser P8, Glu25, Gln14, Pro17, Gly9, Ala9, Val11, Met5, Ile11, Leu17, Tyr10, Phe8, Lys14, His5, Trp2, and Arg6 with a calculated molecular weight of 23,615 (Mercier et al., 1971). Its primary sequence is given in Figure 1; its ExPASy entry name and file number are CAS1_Bovin and P02662, respectively. Since the last nomenclature report (Eigel et al., 1984), 3 new genetic variants of {alpha}S1-CN have been identified. They are {alpha}S1-CN F (Erhardt, 1993), which was found in German Black and White cattle; {alpha}S1-CN G (Mariani et al., 1995), discovered in Italian Brown cows; and the H variant (Mahé et al., 1999). Hence, this family of proteins is currently known to consist of variant A found in Holstein Friesians, Red Holsteins, and German Red cattle (Ng-Kwai-Hang et al., 1984; Grosclaude, 1988; Erhardt, 1993); variant B, which is the predominant variant in Bos taurus (Eigel et al., 1984); variant C in Bos indicus and Bos grunniens (Eigel et al., 1984); variant D in various breeds in France (Grosclaude, 1988) and Italy (Mariani and Russo, 1975) as well as in Jerseys in The Netherlands (Corradini, 1969); and variant E in Bos grunniens (Grosclaude et al., 1976) in addition to the new variants F, G, and H.
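The residue counts quoted above can be checked against the stated chain length of 199. The following is a minimal sketch of that check; the helper name `parse_composition` is hypothetical, and the composition string is transcribed from the text with phosphoserine written as `SerP`.

```python
import re

# Composition of alpha-S1-CN B-8P as quoted in the text.
# "SerP" denotes phosphoserine (written "Ser P" in the article).
COMPOSITION = (
    "Asp7, Asn8, Thr5, Ser8, SerP8, Glu25, Gln14, Pro17, Gly9, Ala9, "
    "Val11, Met5, Ile11, Leu17, Tyr10, Phe8, Lys14, His5, Trp2, Arg6"
)


def parse_composition(text):
    """Return {residue_name: count} from a comma-separated 'Name<count>' list."""
    counts = {}
    for token in text.split(","):
        match = re.fullmatch(r"\s*([A-Za-z]+?)(\d+)\s*", token)
        if match is None:
            raise ValueError(f"unparseable token: {token!r}")
        counts[match.group(1)] = int(match.group(2))
    return counts


counts = parse_composition(COMPOSITION)
total = sum(counts.values())
print(total)  # prints 199, matching the chain length given in the text
```

The same check can be applied to the compositions quoted for the other reference caseins by swapping in the corresponding string.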



Figure 1. The primary structure of {alpha}S1-CN B-8P (Mercier et al., 1971; Grosclaude et al., 1973; Stewart et al., 1984; Nagao et al., 1984; Koczan et al., 1991). Amino acid deletion or substitutions for genetic variants A, C, D, E, F, G, and H, respectively, are indicated in Table 2. Sites of post-translational phosphorylation (SeP) are indicated in italicized, boldface type. The underline indicates the location of another phosphorylation site in a minor species of this protein ({alpha}S1-CN B-9P).

 
The primary structure of {alpha}S1-CN B is given in Figure 1, and the deletion or substitutions for its genetic variants are given in Table 2. The structure of {alpha}S1-CN B was determined by amino acid sequencing (Mercier et al., 1971; Grosclaude et al., 1973) and confirmed by cDNA sequencing (Nagao et al., 1984; Stewart et al., 1984) and by sequencing of the genomic DNA (Koczan et al., 1991). The {alpha}S1-CN signal peptide is composed of 15 amino acid residues, making the pre-form of {alpha}S1-CN B 214 amino acids in length. Variant A uniquely arises as a result of exon skipping caused by a single-base mutation that affects splicing of the pre-mRNA (Mohr et al., 1994). Deletion of residues 14 through 26 was first identified by amino acid sequencing (Grosclaude et al., 1970) and later confirmed by cDNA sequencing (McKnight et al., 1989). Neither the primary structures nor nucleic acid sequences of variants F and G have been completely reported.


Table 2. Positions and amino acid differences in genetic variants of milk proteins.
 
The secondary structure of {alpha}S1-CN has been examined by various methods including CD spectroscopy, Raman spectroscopy, and predictive algorithms using sequence information, and the results have been reviewed previously (Swaisgood, 1992). However, its 3-D structure cannot be determined because the protein does not form crystals. Nuclear magnetic resonance (NMR) studies have also proven to be problematic because of the intrinsic aggregation of the protein. Nevertheless, its tertiary structure has been predicted using a combination of predicted secondary structures, adjusted to conform to the amount of global secondary structures determined experimentally, with molecular-modeling computations based on energy minimization (Kumosinski et al., 1994). The latter structure should be viewed as a working model, which is consistent with bulk properties of the protein; it represents one possible interpretation of its structure.

Since the discovery of genetic variants, attempts have been made to correlate milk characteristics or milk production with the genotype. However, the correlations obtained have not been straightforward, in part because of differences in the parameters used. For example, the {alpha}S1-CN BB phenotype has been correlated with higher milk yields and, thus, higher protein yield over the lactation (Ng-Kwai-Hang et al., 1984; Aleandri et al., 1990; Sang et al., 1994), but the same phenotype has also been correlated with lower protein concentration in milk (Ng-Kwai-Hang et al., 1986, 1992). It appears that cows carrying the G allele produce less {alpha}S1-CN and more of the other caseins (Mariani et al., 1995). For example, homozygous (GG) cows produce 55% less {alpha}S1-CN.

Because of the 13-amino acid residue deletion, the A variant’s characteristics are most different from the other variants (Farrell et al., 1988). Thus, most of the hydrophobic residues in the N-terminal region are eliminated, including the Phe-Phe-Val sequence that is cleaved by chymosin during cheese ripening (Mulvihill and Fox, 1979). Hence, {alpha}S1-CN A is similar to the peptide {alpha}S1-I corresponding to {alpha}S1-CN (f25-199), which has a reduced hydrophobicity (Creamer et al., 1982) and does not aggregate as extensively in the presence of calcium (Kaminogawa et al., 1980). The changes in curd rheology that occur with this proteolysis of the B variant are consistent with the observation that soft curds are formed with milks containing {alpha}S1-CN A (Sadler et al., 1968).

Comparison of the properties of variants B and C has indicated that {alpha}S1-CN C self-associates more strongly (Schmidt, 1970; Swaisgood, 1973), and cheeses made from milks containing the latter form a tougher curd (Sadler et al., 1968).

The distinct regions of anionic clusters and hydrophobicity evident in the primary structure are suggestive of the formation of hydrophobic and polar domains (Swaisgood, 1982, 1992) and are consistent with observed physical-chemical properties, such as the strong dependence of association on concentration, pH, ionic strength, and ion binding. The characteristics and significance of calcium ion binding to the anionic clusters are well known, but it has also been found that Zn2+ (Singh et al., 1989) and Fe (III) (Reddy and Mahoney, 1991) bind at these sites. The effect of these interactions on micelle structure and stability is not known.

{alpha}S2-CN
The {alpha}S2-CN family, which constitutes up to 10% of the CN fraction in bovine milk, consists of 2 major and several minor components exhibiting varying levels of post-translational phosphorylation (Swaisgood, 1992) and minor degrees of intermolecular disulfide bonding (Rasmussen et al., 1992). The predominant forms in bovine milk contain an intramolecular disulfide bond and differ only in their degree of phosphorylation. The reference protein for this family is {alpha}S2-CN A-11P, a single-chain polypeptide with an internal disulfide bond. It consists of 207 amino acid residues: Asp4, Asn14, Thr15, Ser6, Ser P11, Glu24, Gln16, Pro10, Gly2, Ala8, Cys2, Val14, Met4, Ile11, Leu13, Tyr12, Phe6, Lys24, His3, Trp2, and Arg6 with a calculated formula molecular weight of 25,226. The primary structure of this protein is given in Figure 2; its ExPASy entry name and file number are CAS2_Bovin and P02663, respectively. The secondary structure of {alpha}S2-CN has recently been studied by CD and FTIR spectroscopies (Hoagland et al., 2001).



Figure 2. The primary structure of the {alpha}S2-CN A-11P (Brignon et al., 1977; Mahé and Grosclaude, 1982; Stewart et al., 1987; Groenen et al., 1993). Seryl residues (SeP) identified as phosphorylated in {alpha}S2-CN A-11P are indicated in italicized, boldface type. Residues that have been determined to be partially phosphorylated or that potentially may be phosphorylated according to CN kinase specificity are underlined. Amino acid deletions or substitutions for genetic variants are given in Table 2.

 
The genetic variants, identified in the fifth revision of the Nomenclature report (Eigel et al., 1984), are {alpha}S2-CN A, B, C, and D. Upon alkaline urea-gel electrophoresis, these proteins migrate between the {alpha}S1- and ß-CN, and the most prevalent species, {alpha}S2-CN A-11P, has served as the reference band for all proteins in the casein pattern (Whitney et al., 1976). The A variant is most frequently observed in Western breeds, with {alpha}S2-CN D observed with frequencies of 0.01 to 0.09 in Vosgienne and Montbeliarde breeds (Grosclaude et al., 1978) and in 3 Spanish breeds (Osta et al., 1995b). The B variant was observed with low frequencies in zebu cattle in South Africa, and variant C was observed in yaks in the Nepalese valley and the Republic of Mongolia (Grosclaude et al., 1976, 1982).

The primary structure of {alpha}S2-CN A-11P (Figure 2), reported by Brignon et al. (1977), has been changed to Gln at position 87 rather than Glu, as indicated by cDNA sequencing (Stewart et al., 1987) and genomic DNA sequencing (Groenen et al., 1993). The {alpha}S2-CN signal peptide is composed of 15 amino acid residues, making the pre-form 222 amino acid residues in length. The D variant differs from {alpha}S2-CN A by the deletion of 9 amino acid residues from positions 51 to 59. However, the genomic DNA sequence does not reveal a deletion, but rather a substitution, suggesting that the amino acid sequence deletion is caused by the skipping of exon VIII, a 27-nucleotide sequence that encodes amino acid residues 51 to 59 (Bouniol et al., 1993). As shown in Table 2, the C variant differs from the A variant at positions 33, 47, and 130 (Mahé and Grosclaude, 1982). The specific sites of mutation resulting in {alpha}S2-CN B have not been identified, as shown in Table 2. Because of the progress made on this protein, following the elucidation of its sequence, {alpha}S2-CN will be reviewed in more detail in this report. Post-translational phosphorylation, primarily at seryl residues, results in the incorporation of 10 to 13 phosphate moieties. According to the specificity of CN kinase, phosphorylation occurs at Ser/Thr residues in the sequence Ser/Thr-X-Glu/SerP/Asp; however, the sequence Ser-X-Glu/SerP is heavily favored (Mercier, 1981). Only seryl residues are phosphorylated in {alpha}S2-CN A-11P, but Thr-66 was partially phosphorylated in {alpha}S2-CN C (Mahé and Grosclaude, 1982). Those residues known to be phosphorylated in {alpha}S2-CN A-11P are indicated by boldface italics in the figure. The underlined residues indicate potential sites of phosphorylation suggested by the enzyme specificity. It should be noted that Thr-47 in {alpha}S2-CN C is a potential phosphorylation site.
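The CN kinase recognition rule described above, Ser/Thr-X-Glu/SerP/Asp, can be sketched as a simple sequence scan. This is a rough illustration only: the function names, the lowercase `s` convention for an already phosphorylated serine, and the example peptide are all assumptions made here, and the peptide is not taken from the {alpha}S2-CN sequence.

```python
def find_kinase_sites(seq):
    """Return 1-based positions of Ser/Thr followed, two residues later,
    by Glu (E), Asp (D), or a phosphoserine (written here as lowercase 's')."""
    sites = []
    for i, residue in enumerate(seq):
        if residue in "ST" and i + 2 < len(seq) and seq[i + 2] in "EDs":
            sites.append(i + 1)
    return sites


def phosphorylate_iteratively(seq):
    """Mark matching Ser residues as phosphorylated (S -> s) and rescan,
    because a newly formed SerP can complete a further recognition site.
    Thr sites are reported by find_kinase_sites but left unmodified here,
    a simplification reflecting that Ser sites are heavily favored."""
    residues = list(seq)
    changed = True
    while changed:
        changed = False
        for pos in find_kinase_sites("".join(residues)):
            if residues[pos - 1] == "S":
                residues[pos - 1] = "s"
                changed = True
    return "".join(residues)


# Hypothetical peptide used only to exercise the rule.
peptide = "AKSSSEESIT"
print(find_kinase_sites(peptide))          # [4, 5]
print(phosphorylate_iteratively(peptide))  # AKsssEESIT
```

In this toy peptide, Ser-3 becomes a site only after Ser-5 has been phosphorylated, which is one way a run of adjacent seryl residues could end up as an anionic cluster.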

Another post-translational change that occurs with this protein is the formation of disulfide bonds. The 2 cysteinyl residues of this protein participate in both intramolecular and intermolecular disulfide bonds (Rasmussen et al., 1992, 1994). The protein exists predominantly as a monomer (>85%) with a disulfide bond between Cys residues 36 and 40 (Rasmussen et al., 1994) or as a dimer with both parallel and antiparallel disulfide bonds (Rasmussen et al., 1992). Therefore, 2 types of dimers are found: one fraction with residues 36 and 40 in one chain linked to residues 36 and 40, respectively, in the other chain. But, in another fraction, residues 36 and 40 are linked to residues 40 and 36, respectively, in the other chain. These results suggest that the formation of these bonds is not important to any structure required by this protein for its interaction with other CN.

{alpha}S2-Casein is the most hydrophilic of all caseins as a result of the 3 clusters of anionic groups composed of phosphoseryl and glutamyl residues. Although relatively hydrophobic, the C-terminal 47 residues carry a net positive charge (about +9.5) at the pH of milk (Swaisgood, 1992). On the other hand, the more hydrophilic N-terminal 68 residues contain 2 anionic clusters and exhibit a net charge of about –21 at the prevalent pH of milk. Hence, the primary structure of {alpha}S2-CN can be represented by 4 domains: an N-terminal hydrophilic domain with anionic clusters, a central hydrophobic domain, followed by another hydrophilic domain with anionic clusters, and finally a C-terminal positively charged hydrophobic domain (Swaisgood, 1992). This structure is consistent with an association behavior that is very dependent on ionic strength (Snoeren et al., 1980). The association appears to be strongest around an ionic strength of 0.2 M, with dissociation occurring in lower salt because of electrostatic repulsion and also in higher salt because of suppression of electrostatic attraction, thus, reflecting the contributions of both hydrophobic interactions and electrostatic attraction.

The number of anionic clusters and the hydrophilic nature is also reflected in calcium-binding properties of {alpha}S2-CN. For example, the latter protein is more sensitive to Ca2+ than {alpha}S1-CN (Toma and Nakai, 1973), with almost complete precipitation occurring in 2 mM Ca2+ for {alpha}S2-CN at pH 7; whereas, precipitation of {alpha}S1-CN requires 6 mM Ca2+ (Aoki et al., 1985). These properties also led to a method for fractionation of {alpha}S2-CN from other caseins by precipitation from propan-1-ol solutions (Vreeman and van Riel, 1990). Solubility in this solvent is governed by electrostatic interactions that are most prevalent in {alpha}S2-CN.

{alpha}S2-Casein appears to be readily susceptible to proteolysis as assessed by the activities of chymosin and plasmin toward the protein. Chymosin activity was observed in the regions of residues 88 to 98 and 164 to 180, but its primary cleavage occurred at Phe 88-Tyr 89 (McSweeney et al., 1994). These 2 regions are, respectively, at the edge of the central hydrophobic domain or in the first part of the cationic hydrophobic C-terminal domain. Plasmin activity released a number of peptides, including the N-terminal 21 to 24 residues of the initial hydrophilic domain containing one of the anionic clusters (Le Bars and Grippon, 1989; Visser et al., 1989). In agreement with plasmin specificity, mostly Lys-X bonds were cleaved at varying rates (Lys residues 21, 24, 149, 150, 181, 188, and 197). In addition to the shorter N-terminal peptides, a major peptide released was {alpha}S2-CN (f151-207) (Le Bars and Grippon, 1989). In this regard, it is interesting to note that recently {alpha}S2-CN (f165-203) was isolated from milk and shown to have antibacterial activity (Zucht et al., 1995).

ß-CN
The ß-CN family, which constitutes up to 45% of the casein of bovine milk, is quite complex because of the action of the native milk protease plasmin (Eigel et al., 1984). Plasmin cleavage leads to formation of {gamma}1-, {gamma}2-, and {gamma}3-CN, which are actually fragments of ß-CN consisting of residues 29-209, 106-209, and 108-209, respectively. In addition, polypeptides previously called proteose peptone components 5, 8-fast, and 8-slow are fragments of ß-CN, which represent residues 1-105 or 1-107, 1-28, and 29-105, respectively. The reference protein for this family, ß-CN A2-5P, is a single-polypeptide chain with no Cys residues containing 209 residues. It consists of Asp4, Asn5, Thr9, Ser11, Ser P5, Glu19, Gln20, Pro35, Gly5, Ala5, Val19, Met6, Ile10, Leu22, Tyr4, Phe9, Lys11, His5, Trp1, and Arg4 with a calculated molecular weight of 23,983. The most common variant used as reference is variant A2; its ExPASy entry name and file number are CASB_Bovin and P02666, respectively. The A2 variant has been chemically sequenced (Ribadeau-Dumas et al., 1972) and sequenced from its cDNA (Jimenez-Flores et al., 1987; Stewart et al., 1987) and its gene (Bonsing et al., 1988). The ß-CN signal peptide is composed of 15 amino acid residues, making the pre-form 224 amino acids in length.
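The relationship between the plasmin cleavage points and the named ß-CN fragments can be made explicit with a little interval arithmetic. The sketch below assumes cleavage after residues 28, 105, and 107 of the 209-residue chain, which is what the fragment boundaries quoted above imply; the function name `fragments_between` and the lookup table are illustrative choices.

```python
from itertools import combinations

CHAIN_LENGTH = 209               # residues in beta-CN A2-5P
CLEAVAGE_AFTER = (28, 105, 107)  # inferred from the fragment boundaries in the text


def fragments_between(length, cuts):
    """Return every (start, end) fragment bounded by the N-terminus, the
    C-terminus, or a cleavage point, using 1-based inclusive numbering."""
    boundaries = [0, *sorted(cuts), length]
    return [(a + 1, b) for a, b in combinations(boundaries, 2)]


# Fragment names as given in the text.
NAMED = {
    (29, 209): "gamma1-CN",
    (106, 209): "gamma2-CN",
    (108, 209): "gamma3-CN",
    (1, 105): "proteose peptone component 5",
    (1, 107): "proteose peptone component 5",
    (1, 28): "proteose peptone component 8-fast",
    (29, 105): "proteose peptone component 8-slow",
}

for start, end in fragments_between(CHAIN_LENGTH, CLEAVAGE_AFTER):
    label = NAMED.get((start, end), "not named in the text")
    print(f"f{start}-{end}: {label}")
```

Three cleavage points yield ten possible fragments, counting the intact chain; the seven named in the text are all among them.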

The sequence shown for ß-CN A2 in Figure 3 is that as corrected by 2 groups (Yan and Wold, 1984; Carles et al., 1988). It differs from the original sequence (Eigel et al., 1984) in 4 places: Gln117Glu, Pro 137 and Leu 138 are inverted, Glu175Gln, and Gln195Glu. The changes at residues 117 and 175 are confirmed by both groups and by gene sequencing. The inversion of residues 137 and 138 is not in agreement with cDNA sequencing data (Jimenez-Flores et al., 1987), which are in accordance with the original data. However, the Leu-Pro substitution is a one-base change, and mutations could occur and not be observed by HPLC-mass spectroscopy (MS) of peptides or by electrophoresis of the proteins. The weight here is, however, given to the 2 independent protein-sequencing reports. In a similar fashion, the change at 195 is not in agreement with the cDNA results, but, in this case, 3 lines of evidence support the occurrence of only Glu at residue 195. They include the following:



Figure 3. Primary structure of Bos ß-CN A2-5P (Ribadeau-Dumas et al., 1972; Grosclaude et al., 1973). The amino acid residues corresponding to the mutational differences in the genetic variants, A1, A3, B, C, D, E, F, G, H, and I are indicated in Table 2. Sites of post-translational phosphorylation (SeP) are indicated in italicized, boldface type. The arrows indicate the points of attack by plasmin responsible for ß-CN fragments ({gamma}-CN and proteose peptones) present in milk.

 

The previous report of Eigel et al. (1984) described 7 genetic variants. Since that revision, 3 new variants have been identified by sequence: ß-CN F, previously called ß-CN X (Visser et al., 1995); ß-CN G (Dong and Ng-Kwai-Hang, 1998); and ß-CN H (Han et al., 2000). The amino acid substitutions giving rise to all variants of ß-CN are given in Table 2. In addition, Chung et al. (1995) identified variant A4 in native Korean cattle using electrophoresis only; its substitutions in the A2 reference protein are unknown.

Visser et al. (1995) identified ß-CN F, which contains the A1 substitution and Leu for Pro at residue 152. ß-Casein F was separated by preparative reverse-phase HPLC. The main differential peaks representing the 114 to 169 fragments of ß-A1 and ß-X, respectively, were both purified following trypsin digestion, cyanogen bromide cleavage, and separation of the corresponding peptides representing the 145–156 sequence. The presence of a Leu residue at position 152 instead of the Pro-152 in ß-CN A1 was established by fast-atom bombardment MS-MS. In accordance with internationally accepted guidelines for the nomenclature of milk proteins, the new genetic variant has been named ß-CN F-5P. In a similar fashion, Dong and Ng-Kwai-Hang (1998) identified ß-CN G-5P, which is similar to ß-CN A1 and F, but contains a Leu in place of Pro at either position 137 or 138, depending on the sequence assigned, as the Pro-Leu inversion is controversial. Han et al. (2000) identified ß-CN H, which represents 2 substitutions relative to the corrected reference ß-CN A2. These are Arg25 to Cys and Leu88 to Ile. A genetic variant discovered by Senocq et al. (2002) was also named H; so, it is proposed that the Han variant be termed H1, and the Senocq variant be termed H2. The H2 variant differs from the A2 variant at 2 known positions (Met93Leu and Gln72Glu) and a substitution of Gln to Glu between residues 114–169. Finally, the I variant was described by Jann et al. (2002); it contains only the Met93Leu substitution of the H2 variant.

ß-Casein is the most hydrophobic of the CN. The N-terminal sequence codes for charged amino acids as well as a phosphoserine cluster. This initial sequence is different from the second half of the molecule, where neutral and hydrophobic amino acid residues abound. Calculation of the net charge at pH 6.6 indicates that the first 21 amino acids would have a net charge of about –11.5, and the C-terminal 21 amino acids (190–209) have no net charge. This molecule presents a high contrast in its sequence: one-tenth of the amino acids at the N-terminus of the protein contain one-third of the total charge, while 75% of the residues at the C-terminal one-tenth consist of hydrophobic amino acids. It is this unusual distribution of amino acids that leads to the release of ß-CN from CN micelles in the cold (Aoki et al., 1990). It is important to mention that no 3-D structure from X-ray crystallography has been reported; however, a computer-generated working 3-D model has been presented (Kumosinski et al., 1993b). Perhaps the difficulty in inducing suitable crystals from this protein is due to its dependence on the environment surrounding it and its propensity for self-association.
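A net-charge estimate of the kind mentioned above can be sketched with the Henderson-Hasselbalch relation. The example below works from the whole-protein composition of ß-CN A2-5P quoted earlier in this section rather than from a sequence segment, so it does not reproduce the segment values in the text. All pKa values are assumed, generic figures chosen for illustration (they are not from this report), and the first ionization of phosphoserine is assumed to be complete.

```python
def fraction_protonated(pka, ph):
    """Henderson-Hasselbalch: fraction of a group in its protonated form."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


# Assumed, illustrative pKa values (not taken from the report).
BASIC_PKA = {"Lys": 10.5, "Arg": 12.5, "His": 6.5, "N-term": 8.0}
ACIDIC_PKA = {"Asp": 3.9, "Glu": 4.3, "Tyr": 10.1, "C-term": 3.6}
SERP_SECOND_PKA = 6.0  # assumed pKa of the second phosphate ionization

# Ionizable groups of beta-CN A2-5P, from the composition quoted in the text.
BETA_CN_COUNTS = {
    "Asp": 4, "Glu": 19, "Tyr": 4, "Lys": 11, "Arg": 4, "His": 5,
    "SerP": 5, "N-term": 1, "C-term": 1,
}


def net_charge(counts, ph):
    """Estimate net charge by summing independent ionizable groups."""
    charge = 0.0
    for name, pka in BASIC_PKA.items():
        charge += counts.get(name, 0) * fraction_protonated(pka, ph)
    for name, pka in ACIDIC_PKA.items():
        charge -= counts.get(name, 0) * (1.0 - fraction_protonated(pka, ph))
    # Each phosphoserine: -1 (first ionization) plus a partial second charge.
    second = 1.0 - fraction_protonated(SERP_SECOND_PKA, ph)
    charge -= counts.get("SerP", 0) * (1.0 + second)
    return charge


print(round(net_charge(BETA_CN_COUNTS, 6.6), 1))
```

Treating every group as independent ignores the local electrostatic effects that matter in phosphoserine clusters, so the result is best read as an order-of-magnitude estimate.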

Addition of a glycosylation signal in the gene of ß-CN has been reported (Choi and Jimenez-Flores, 1996). This modification of a bovine milk protein is an important aspect of this review because the original gene was generated from ß-CN A1 with the glycosylation signal, changing from Pro 67 to Ser 67. However, this new variant represents a man-made intervention and does not occur naturally. We suggest that nomenclature of genetic variants induced through molecular biology techniques follow the same mechanisms as those established by this Committee for naturally occurring variants, e.g., a point mutation of ß-CN A1, Pro67Ser.
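Substitution codes such as Pro67Ser or Arg25Cys follow a regular pattern: original residue, position, replacement residue. A minimal sketch of parsing and applying such a code is given below; the function names and the short test sequence are hypothetical and are not drawn from any casein.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_substitution(code):
    """Split a code like 'Pro67Ser' into (original, position, replacement)."""
    match = re.fullmatch(r"([A-Z][a-z]{2})\s*(\d+)\s*([A-Z][a-z]{2})", code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    old, pos, new = match.group(1), int(match.group(2)), match.group(3)
    if old not in THREE_TO_ONE or new not in THREE_TO_ONE:
        raise ValueError(f"unknown residue name in {code!r}")
    return old, pos, new


def apply_substitution(sequence, code):
    """Apply a substitution to a one-letter sequence (1-based positions),
    checking that the reference residue matches before replacing it."""
    old, pos, new = parse_substitution(code)
    if sequence[pos - 1] != THREE_TO_ONE[old]:
        raise ValueError(f"position {pos} is {sequence[pos - 1]}, not {old}")
    return sequence[:pos - 1] + THREE_TO_ONE[new] + sequence[pos:]


print(parse_substitution("Pro67Ser"))        # ('Pro', 67, 'Ser')
print(apply_substitution("MKPLV", "Pro3Ser"))  # MKSLV (hypothetical sequence)
```

Checking the reference residue before substituting guards against the numbering discrepancies that arise when a sequence has been revised, as with positions 137 and 138 of ß-CN.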

{kappa}-CN
The {kappa}-CN family consists of a major carbohydrate-free component and a minimum of 6 minor components. The 6 minor components, as detected by PAGE in urea with 2-mercaptoethanol (Mackinlay and Wake, 1965; Pujolle et al., 1966; Woychik et al., 1966; Vreeman et al., 1977; Doi et al., 1979), represent varying degrees of phosphorylation and glycosylation.

{kappa}-casein, as isolated from milk, also occurs in the form of a mixture of disulfide-bonded polymers ranging from dimer to octamers and above (Groves et al., 1992). Beeby (1964) reported the presence of free thiol groups after calcium removal by treatment with ethylenediaminetetraacetate, but other chemical analyses did not confirm this result (Swaisgood et al., 1964). Sodium dodecyl sulfate-gel electrophoresis and physical measurements suggest that the native form of {kappa}-CN is highly associated both chemically and physically (Swaisgood and Brunner, 1963; Groves et al., 1992; Farrell et al., 1996) and that heat treatment of native {kappa}-CN results in aggregation caused by free sulfhydryl-disulfide interchange (Groves et al., 1998). Reduction and S-carboxymethylation of {kappa}-CN followed by heating can result in amyloid (fibrillar) structures (Farrell et al., 2002).

The primary structure of the reference protein of the {kappa}-CN family is the major carbohydrate-free component of {kappa}-CN A-1P (Figure 4); its ExPASy entry name and file number are CASK_Bovin and P02668, respectively. It consists of 169 amino acid residues as follows: Asp4, Asn8, Thr15, Ser12, Ser P1, Pyroglu1, Glu12, Gln14, Pro20, Gly2, Ala14, Cys2, Val11, Met2, Ile12, Leu8, Tyr9, Phe4, Lys9, His3, Trp1, and Arg5, with a formula molecular weight of 19,037. There is still some question about the presence of the N-terminal pyroglutamyl residue in the native protein, as cyclization may occur during isolation (Swaisgood, 1975). In addition to protein-chemical sequencing, the cDNA of {kappa}-CN has been sequenced (Stewart et al., 1984), and the {kappa}-CN gene sequence is complete (Alexander et al., 1988). The {kappa}-CN signal peptide is composed of 21 amino acid residues, making the pre-form 190 amino acids in length.
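The pre-form lengths quoted for the four reference caseins follow from adding the signal peptide to the mature chain. A minimal sketch of that arithmetic, using only the lengths stated in the text (the dictionary layout and function name are illustrative choices):

```python
# (mature chain length, signal peptide length), as stated in the text.
REFERENCE_CASEINS = {
    "alphaS1-CN B": (199, 15),
    "alphaS2-CN A": (207, 15),
    "beta-CN A2": (209, 15),
    "kappa-CN A": (169, 21),
}


def preform_length(mature, signal):
    """Length of the pre-form: signal peptide plus mature chain."""
    return mature + signal


for name, (mature, signal) in REFERENCE_CASEINS.items():
    print(f"{name}: {mature} + {signal} = {preform_length(mature, signal)}")
```

The four results, 214, 222, 224, and 190, agree with the pre-form lengths given for each family.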



Figure 4. Primary structure of Bos {kappa}-CN A-1P (Mercier et al., 1973). The amino acid residues corresponding to the mutational differences in the B through J variants are given in Table 2. The arrow indicates the point of attack by chymosin (rennin). The * indicates pyroglutamate as the cyclized N-terminal. The site of post-translational phosphorylation (SeP) is indicated in italicized, boldface type; residues that may potentially be phosphorylated are underlined.

 
The 2 common genetic variants are designated A and B (Neelin, 1964; Woychik, 1964). {kappa}-Casein B-1P differs from the A variant (Figure 4) by substitution of an Ile residue for Thr at position 136 and an Ala residue for Asp at position 148 (Mercier et al., 1973). The A variant tends to be predominant in most dairy breeds with the exception of Jersey cattle (Thompson and Farrell, 1974; Bech and Kristiansen, 1990; Ng-Kwai-Hang and Grosclaude, 2003). In alkaline gel electrophoresis, in the presence of mercaptoethanol and urea, both variants show multiple bands; the A variant possesses the greater mobility (Mackinlay et al., 1966; Swaisgood, 1975). In addition, 9 other genetic variants have been reported (Table 2). The C and E variants were characterized by digestion with cyanogen bromide (Miranda et al., 1993); C differs from A by substitution of His 97 for Arg 97; E differs from A by substitution of Gly 155 for Ser 155. The C variant has been confirmed by PCR as well (Schlee and Rottman, 1992). The letter D was ascribed to a new variant by PAGE analysis, but this variant was later found to be identical to {kappa}-CN C (Erhardt, 1989). The incorrect identification of {kappa}-CN D indicates the need for sequence analysis (PCR, protein, etc.) to confirm new genetic variants for all CN. {kappa}-Casein F was discovered by PCR analysis of both Zebu and Black and White hybrid cattle (Sulimova et al., 1992). This analysis revealed 2 nucleotide changes between {kappa}-CN A and {kappa}-CN F: a G for T in the second position coding for Thr 145 (which yields no change in the protein) and a T for G in the second position of Asp 148 (which yields Val 148 in the F variant). This latter protein should be termed F1, as Prinzenberg et al. (1996) by PCR analysis described a second F variant that contains the A-B changes (residues 136, 148) as well as an A for G substitution, which yields a change from Arg 10 to His 10. This variant (Arg10His) could be considered to be F2, as the same researchers (Prinzenberg et al., 1996) named their next discovered variant G. This new variant ({kappa}-CN G) was reported in Alpine breeds solely on the basis of isoelectric focusing gels (Erhardt, 1996) but was confirmed as a point mutation by PCR; this G variant causes Arg 97 of {kappa}-CN B to be changed to Cys 97. Again, this variant could be termed G1, as Sulimova et al. (1996) found another variant of {kappa}-CN in yak (Bos grunniens) that they also termed {kappa}-CN G; it differs from the {kappa}-CN A by an Asp148Ala mutation, and the codons for residues 167 and 168 are different but yield no changed protein phenotype. This latter variant (Asp148Ala) could be termed G2. The reasoning for the superscript nomenclature is that Prinzenberg et al. (1999) identified yet another variant in Pinzgauer cattle and termed it {kappa}-CN H. The H protein deviates from the A variant by a Thr135Ile mutation. Coincidentally, it is identical with {kappa}-CN A-Zebu found by Grosclaude et al. (1974). In another study, Prinzenberg et al. (1999) described {kappa}-CN I, which differs from the A variant by a Ser104Ala change. Finally, {kappa}-CN J was discovered in Bos taurus cattle on the Ivory Coast by Mahé et al. (1999); this variant appears to have arisen from the {kappa}-CN B variant by a Ser155Arg mutation.

The bond sensitive to chymosin (EC 3.4.23.4) (rennin) hydrolysis occurs between Phe 105 and Met 106 (Figure 4) (Delfour et al., 1965; Jollés et al., 1968). The hydrolytic products are para-{kappa}-CN (residues 1–105) and the macropeptide (residues 106–169). Doi et al. (1979) and Vreeman et al. (1977, 1986) have observed para-{kappa}-CN in purified preparations of {kappa}-CN. This is undoubtedly due to a chymosin-like proteolysis subsequent to translation, but more work must be done before concluding that para-{kappa}-CN is a natural constituent of milk or a product of storage or of the preparatory processes. It is interesting to note that of the 11 known variants, 8 occur in the distal portion of the macropeptide, relatively far removed from the point of chymosin attack. These mutations range from positions 136 to 155 and occur in the extended portion of the molecule, which serves as a physical deterrent to coagulation prior to the action of chymosin. Small, relatively neutral changes in this portion of the molecule may not adversely affect cheese making. Perhaps more interesting are the changes in the C, F2, and G1 variants, which occur in the para-{kappa}-CN portion of the molecule. The F2 variant is functionally identical in that the net charge remains constant (Arg10His). The C variant may be of interest as Arg 97 has been implicated in the possible attraction of chymosin to the CN micelle, but the positive charge is conserved by the His 97 substitution. In a similar way, the G1 variant, in which Arg 97 is converted to Cys 97, could further influence micelle structure, as the new sulfhydryl residue could promote unusual disulfide linkages close to the chymosin cleavage site.
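
The location of each reported substitution relative to the chymosin-sensitive bond can be tabulated mechanically. The following is a minimal Python sketch, not part of the Committee's report: the substitution lists are transcribed from the text above (with the B-variant background written out for F2, G1, and J), and the names VARIANTS, parse_substitution, fragment, and fragments_hit are illustrative assumptions.

```python
import re

# Chymosin cleaves kappa-CN between Phe 105 and Met 106 (from the text).
PARA_KAPPA_END = 105    # para-kappa-CN spans residues 1-105
KAPPA_CN_LENGTH = 169   # the macropeptide spans residues 106-169

# Substitutions relative to kappa-CN A, transcribed from the text.
# F2, G1, and J are described as arising on the B background, so the
# two B changes (136, 148) are listed for them as well.
VARIANTS = {
    "B": ["Thr136Ile", "Asp148Ala"],
    "C": ["Arg97His"],
    "E": ["Ser155Gly"],
    "F1": ["Asp148Val"],
    "F2": ["Arg10His", "Thr136Ile", "Asp148Ala"],
    "G1": ["Arg97Cys", "Thr136Ile", "Asp148Ala"],
    "G2": ["Asp148Ala"],
    "H": ["Thr135Ile"],
    "I": ["Ser104Ala"],
    "J": ["Thr136Ile", "Asp148Ala", "Ser155Arg"],
}

_SUBSTITUTION = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_substitution(label):
    """Split a label such as 'Arg97His' into ('Arg', 97, 'His')."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def fragment(position):
    """Name the chymosin fragment that contains a residue position."""
    if not 1 <= position <= KAPPA_CN_LENGTH:
        raise ValueError(f"position {position} is outside 1-{KAPPA_CN_LENGTH}")
    return "para-kappa-CN" if position <= PARA_KAPPA_END else "macropeptide"


def fragments_hit(variant):
    """Sorted list of fragments carrying at least one substitution."""
    positions = (parse_substitution(s)[1] for s in VARIANTS[variant])
    return sorted({fragment(p) for p in positions})


if __name__ == "__main__":
    for name in VARIANTS:
        print(name, fragments_hit(name))
```

Under these assumptions the macropeptide is 169 - 105 = 64 residues long, and the substitutions at positions 10, 97, and 104 fall in para-{kappa}-CN while those at 135 through 155 fall in the macropeptide.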

The major component (~50%) of all {kappa}-CN variants is generally believed to be the carbohydrate-free component. However, the post-translational modifications of {kappa}-CN, which result in the formation of minor components, have been studied in considerable detail, and their degree of complexity is correlated with the degree of sophistication of the instrumentation used to study them. Generally, the minor {kappa}-CN components are multiglycosylated and/or multiphosphorylated forms of the major {kappa}-CN. Vreeman et al. (1977, 1986) concluded from their investigations that, in order of elution from DEAE-cellulose, the adsorbed {kappa}-CN were the major components free of carbohydrate with one phosphate group followed by 6 minor components differing in degrees of glycosylation and phosphorylation. Doi et al. (1979) concluded from their fractionation of {kappa}-CN on DEAE-cellulose that there were 4 major and 2 minor components, all containing one phosphate group and various degrees of glycosylation. The major fraction was the carbohydrate-free component.

Several researchers have investigated the structure of carbohydrate moieties (Tran and Baker, 1970; Fiat et al., 1972; Jollés et al., 1972, 1973; Fournet et al., 1975; Jollés et al., 1978). Fournet et al. (1975) isolated 3 oligosaccharides from {kappa}-CN and determined the structures for 2. Saito and Itoh (1992) confirmed their structures and added 3 more; all of these structures are given in Figure 5. A composite summary of the reported glycosyl moieties, their molecular weights, and relative percentage occurrence is given in Table 3. From these data, it appears as though the complex structures C, D, and E are most prevalent. Mollé and Léonil (1995) used reverse-phase HPLC in conjunction with on-line electrospray ionization MS (EIMS) to characterize the distribution of glycosyl residues further within the {kappa}-CN macropeptide for the A variant. They found at least 14 glycosylated forms, a glycosylated and non-phosphorylated form, and multiphosphorylated forms (1P at 78%, 2P at 20%, and 3P at 2%) totaling 18 reported species attributable to post-translational modification. The EIMS data on HPLC followed the absorbance data and confirmed the structures of Saito and Itoh (1992) except for A, the monosaccharide.



Figure 5. Reported structures for glycosyl residues attached to {kappa}-CN macropeptide (Saito and Itoh, 1992).

 

Table 3. Summary of properties and occurrence of O-glycosyl residues attached to {kappa}-CN macropeptides.
 
The points of attachment of the glycosyl chains have also received a good deal of attention. Again, because of the high degree of heterogeneity and small amounts of some forms, the specific sites of attachment of various chains shown in Figure 5 are not clearly defined. Further advances in analytical tools, such as EIMS, could confirm this. The Thr residues at positions 131, 133, or 135 (Fiat et al., 1972; Jollés et al., 1973; Fournet et al., 1975; Kanamori et al., 1980) were initially identified as points of attachment of oligosaccharide chains through O-glycosidic linkages. If an oligosaccharide is attached to {kappa}-CN A-1P at Thr 136, the B variant could not contain an oligosaccharide at this position, as Ile replaces Thr (Figure 5, Table 4). Although reports of monosaccharide attachments to para-{kappa}-CN exist (Wheelock et al., 1969, 1973), para-{kappa}-CN has generally been reported to be devoid of carbohydrate, as monoglycosylated forms of proteins can be formed artifactually (Pisano et al., 1994; Mollé and Léonil, 1995).


Table 4. Summary of the reported sites of O-glycosylation of {kappa}-CN macropeptide.
 
The carbohydrate present on colostral {kappa}-CN is more complex and variable than that of normal milk and was reviewed in the last report of this Committee (Eigel et al., 1984). However, a salient feature is that only Thr residues 131, 133, and 135 have been identified as points of attachment for the complex oligosaccharides bound to colostral {kappa}-CN (Doi et al., 1980). The earliest reported sites, Thr 131 and 133 (Pujolle et al., 1966; Mercier et al., 1973), have been confirmed, and others have been added to the list. The most complete study was conducted by Pisano et al. (1994), who demonstrated up to 6 sites for O-glycosylation. A summary of the reported sites is given in Table 4.

From all of these data, it would appear that for {kappa}-CN glycosylation, structures C, D, and E are statistically most prevalent, and Thr residues 131 and 133 are the sites most populated by these glycosyl moieties. Two reasons postulated for the prevalence of glycosyl residues at the 131 and 133 sites are that the proline turns, which bound this region, maintain its surface orientation (Kumosinski et al., 1993a) and that glycosylation at other sites is restricted by the neighboring Ile (or Val) residues (Pisano et al., 1994), which are highly prevalent in the {kappa}-CN macropeptide. In studies with whole CN, it is important to remember that the major {kappa}-CN bands for the A and B variants (40 to 60%) are not glycosylated; therefore, differences in the minor bands could be related to a variety of factors specific to the milk sample, not the least of which is genetic variation (Pisano et al., 1994; Mollé and Léonil, 1995).

Because of the high degree of heterogeneity and the limited amounts of the minor components of {kappa}-CN, we feel that the precise nomenclature of these components still cannot be achieved at this time. We suggest that they be identified according to the genetic variant of the major nonglycosylated component and that isolated fractions, which contain post-translational modifications, be numbered consecutively according to either their increasing relative electrophoretic mobility in alkaline urea gels or their elution from anion exchange media in the presence of mercaptoethanol. For example, starting with {kappa}-CN A, the nonglycosylated 1P form would be designated {kappa}-1, and then subsequent bands would be termed {kappa}-2, {kappa}-3, {kappa}-4, etc. This is in accord with most current working definitions of isolated fractions.
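
The consecutive numbering suggested above can be sketched in a few lines of Python. This is only an illustration of the proposed rule: the function name name_fractions, the ASCII prefix "kappa", and the mobility values in the example are hypothetical and are not measurements from any cited study.

```python
def name_fractions(mobilities, prefix="kappa"):
    """Assign consecutive names in order of increasing relative mobility.

    `mobilities` maps an arbitrary fraction label to its relative
    electrophoretic mobility in alkaline urea gels. The slowest band
    receives index 1, following the suggestion that the nonglycosylated
    1P form be designated kappa-1.
    """
    ordered = sorted(mobilities.items(), key=lambda item: item[1])
    return {
        label: f"{prefix}-{index}"
        for index, (label, _mobility) in enumerate(ordered, start=1)
    }


if __name__ == "__main__":
    # Hypothetical relative mobilities, for illustration only.
    example = {
        "nonglycosylated 1P": 0.40,
        "minor band a": 0.47,
        "minor band b": 0.55,
    }
    for label, name in name_fractions(example).items():
        print(f"{label}: {name}")
```

The same function could be applied to elution order from anion-exchange media by passing elution positions instead of mobilities, since the rule depends only on rank.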


    THE WHEY PROTEINS
 
The term whey proteins has been used to describe the group of milk proteins that remain soluble in milk serum or whey after precipitation of CN at pH 4.6 and 20°C. Traditionally, ß-LG, {alpha}-LA, serum albumin (SA), Ig, and proteose-peptone fractions have been considered the major characterized components of this fraction. Because the current and most frequently used method of assessing the integrity of this fraction is SDS-PAGE, LF, a major component that is readily visualized by this technique, should be added to this list (Figure 6). It is also recognized that proteolytic fragments of CN (Eigel et al., 1984) and fat globule membrane proteins (Mather et al., 2000) occur in the traditional whey fraction, raising questions concerning the utility of this term. However, based on current knowledge, the term whey protein should be used only in a general sense to describe milk proteins soluble at pH 4.6 and 20°C. Commercial products termed whey protein isolates or concentrates are obtained from cheese manufacture at higher pH and will contain intact caseins as well as their proteolytic products, such as macropeptides and proteose-peptone fraction. Individual families, such as ß-LG, {alpha}-LA, SA, and LF, should be classified according to homology with the primary sequence of their amino acid chains. Polyacrylamide or starch gel electrophoresis still can be used to characterize and identify individual members of each family. Immunoglobulins, proteins not unique to milk, are the products of B-lymphocytes and are the result of somatic gene segment rearrangement and somatic mutation. With 1 million variants, Ig lend themselves poorly to traditional biochemical characterization. Immunochemical criteria continue to be used for laboratory diagnosis and quantitation of Ig, but molecular genetics is heavily used for structural analyses.



Figure 6. Discontinuous SDS-PAGE of dialyzed whey (pH 4.6) and individual purified whey proteins. 1 = dialyzed whey (pH 4.6); 2 = lactoferrin; 3 = BSA; 4 = bovine IgG, Cohn fraction II heavy chain (top) and light chain (bottom); 5 = ß-LG; 6 = {alpha}-LA; 7 = dialyzed whey (pH 4.6); and 8 = standard purified proteins. Refer to Figure 10a for more details on Ig proteins. All samples were reduced with 2-mercaptoethanol (Basch et al., 1985).

 
ß-LG
ß-Lactoglobulin is the major protein in whey. Both the A and B genetic variants occur at high frequency in most breeds of cow, and the presence of one or the other of these 2 variants affects the properties of the milk markedly (Jakob and Puhan, 1992; Hill et al., 1996), partly because of the different physico-chemical characteristics of the ß-LG molecules themselves and partly because the A variant is expressed at a higher level than the B variant (Aschaffenburg and Drewry, 1957) or the C variant (Ng-Kwai-Hang and Grosclaude, 1992; 2003; Hill et al., 1996). The latter effect suggests that there may be some differences among the non-coding DNA sequences of ß-LG variants. The gene sequence was published by Alexander et al. (1993), and the 5' regions were explored more recently (Wagner et al., 1994; Geldermann et al., 1996). The ß-LG signal peptide is composed of 16 amino acids, making the pre-form of ß-LG 178 amino acids in length.

The reference protein for this family, ß-LG B, consists of 162 amino acids and has the following composition: Asp10, Asn5, Thr8, Ser7, Glu16, Gln9, Pro8, Gly4, Ala15, Cys5, Val9, Met4, Ile10, Leu22, Tyr4, Phe4, Lys15, His2, Trp2, and Arg3. The calculated formula molecular weight is 18,277, and the measured molecular weight is 18,278.35 ± 2.2 Da (Léonil et al., 1995) or 18,277.0 ± 0.9 (Burr et al., 1997); its ExPASy entry name and file number are LACB_Bovin and P02754, respectively. The primary sequence shown in Figure 7 is unchanged since the 1984 review (Eigel et al., 1984). However, the disulfide bonds in the native protein are now unambiguously determined as Cys 66 to Cys 160 and Cys 106 to Cys 119, with Cys 121 as the source of the free thiol (Papiz et al., 1986; Bewley et al., 1997; Brittan et al., 1997; Brownlow et al., 1997; Qin et al., 1998a, b; 1999). The calculated formula weight of 18,277 takes these disulfide linkages into account.
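
The stated residue count and formula weight can be cross-checked from the composition alone. The sketch below is an illustration rather than the Committee's calculation: it assumes standard average residue masses (values not given in this report), adds one water for the chain termini, and subtracts two hydrogens for each of the 2 disulfide bonds. The names AVERAGE_RESIDUE_MASS, BLG_B_COMPOSITION, and formula_weight are hypothetical.

```python
# Average residue masses in Da (amino acid minus water). These are
# commonly tabulated values and are an assumption of this sketch.
AVERAGE_RESIDUE_MASS = {
    "Gly": 57.0519, "Ala": 71.0788, "Ser": 87.0782, "Pro": 97.1167,
    "Val": 99.1326, "Thr": 101.1051, "Cys": 103.1388, "Leu": 113.1594,
    "Ile": 113.1594, "Asn": 114.1038, "Asp": 115.0886, "Gln": 128.1307,
    "Lys": 128.1741, "Glu": 129.1155, "Met": 131.1926, "His": 137.1411,
    "Phe": 147.1766, "Arg": 156.1875, "Tyr": 163.1760, "Trp": 186.2132,
}
WATER = 18.0153      # added once for the free N- and C-termini
HYDROGEN = 1.00794   # two H atoms are lost per disulfide bond

# Composition of beta-LG B as listed in the text.
BLG_B_COMPOSITION = {
    "Asp": 10, "Asn": 5, "Thr": 8, "Ser": 7, "Glu": 16, "Gln": 9,
    "Pro": 8, "Gly": 4, "Ala": 15, "Cys": 5, "Val": 9, "Met": 4,
    "Ile": 10, "Leu": 22, "Tyr": 4, "Phe": 4, "Lys": 15, "His": 2,
    "Trp": 2, "Arg": 3,
}


def residue_count(composition):
    """Total number of residues in a composition table."""
    return sum(composition.values())


def formula_weight(composition, disulfide_bonds=0):
    """Average formula weight of a single chain with the given disulfides."""
    residues = sum(
        AVERAGE_RESIDUE_MASS[name] * count
        for name, count in composition.items()
    )
    return residues + WATER - 2 * HYDROGEN * disulfide_bonds


if __name__ == "__main__":
    print(residue_count(BLG_B_COMPOSITION))
    print(round(formula_weight(BLG_B_COMPOSITION, disulfide_bonds=2), 1))
```

With these assumed masses the composition sums to 162 residues and the formula weight comes out within 1 Da of the 18,277 quoted above; omitting the disulfide correction raises the value by about 4 Da.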



Figure 7. Primary structure of Bos ß-LG B (Eigel et al., 1984). The free sulfhydryl group is on Cys121 in the native form of the protein (Bewley et al., 1997; Brittan et al., 1997; Brownlow et al., 1997; Qin et al., 1998a,b, 1999). The two disulfide bridges are residues 66 to 160 and 106 to 119. The genetic variant substitutions are given in Table 2 for the A (Eigel et al., 1984), C (Eigel et al., 1984), D (Eigel et al., 1984), Dr (Eigel et al., 1984), E (Eigel et al., 1984), F (Eigel et al., 1984), G (Eigel et al., 1984), I (Godovac-Zimmermann et al., 1996), J (Godovac-Zimmermann et al., 1996), and W (Godovac-Zimmermann et al., 1990) variants. The sequence positions of the major secondary structural features, {alpha}-helix, helical regions, and ß-strands (ß-A to ß-I), are shown above the main sequence. The elements shown were determined for the triclinic (lattice X) form (Brownlow et al., 1997), and there are slight differences, particularly at the ends of the assigned structural elements, for the other crystal forms (Bewley et al., 1997; Qin et al., 1999) and at different pH (Qin et al., 1998a). The PDB (protein structural database) file for this protein is 1B8E.

 
A number of new variants have been identified, and many have been sequenced (as indicated in Table 2). New variants are H (Conti et al., 1988; Davoli et al., 1988), I (Godovac-Zimmermann et al., 1996), J (Godovac-Zimmermann et al., 1996), and W (Godovac-Zimmermann et al., 1990). It should be noted that the sequences of ß-LG E, F, G, and Dr do not appear to have been examined since the last review (Eigel et al., 1984), and, thus, the precise sequence site of some of the variation has not been clarified. The amino acid differences for ß-LG H have not been completely verified, so the validity of its uniqueness has not been demonstrated; Ng-Kwai-Hang and Grosclaude (2003) have suggested Lys70Asn. ß-Lactoglobulin J was previously denoted as ß-LG X by Baranyi et al. (1993). Another new variant, ß-LG W, which has been named out of alphabetic sequence, is a "silent" variant involving the substitution of an uncharged Ile56 residue by another uncharged Leu56 residue (Godovac-Zimmermann et al., 1990). It was detected by a slight difference in mobility on isoelectric focusing, although no difference in mobility was observed by normal alkaline PAGE. Mass spectrometry, especially MS-MS, is probably the method of choice to confirm new variants, but it may not be suitable for ß-LG W (to identify the proposed Ile/Leu substitution).
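
The caution about ß-LG W follows from simple arithmetic: Ile and Leu are isomers with the same residue mass, so an Ile/Leu exchange produces no shift in intact mass, whereas a substitution such as the suggested Lys70Asn does. A minimal Python sketch, assuming standard average residue masses (not taken from this report) and a hypothetical helper named mass_shift:

```python
import re

# Average residue masses in Da; commonly tabulated values, listed here
# only for the residues needed in the examples (an assumption of this sketch).
AVERAGE_RESIDUE_MASS = {
    "Ile": 113.1594, "Leu": 113.1594,
    "Lys": 128.1741, "Asn": 114.1038,
}

_SUBSTITUTION = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def mass_shift(label):
    """Change in average mass (new minus old) for a label like 'Lys70Asn'."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    old, _position, new = match.groups()
    return AVERAGE_RESIDUE_MASS[new] - AVERAGE_RESIDUE_MASS[old]


if __name__ == "__main__":
    for label in ("Ile56Leu", "Lys70Asn"):
        print(label, round(mass_shift(label), 4))
```

Under these assumptions Ile56Leu gives a shift of exactly zero, while Lys70Asn lowers the mass by roughly 14 Da, which is well within the precision of the mass measurements quoted for ß-LG B.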

The publication of the first high-resolution X-ray crystal structures of the triclinic form (lattice X) of ß-LG A/B (Brownlow et al., 1997), the orthorhombic form (lattice Y) of ß-LG A, B, and C (Bewley et al., 1997), and the trigonal form (lattice Z) of ß-LG A and B (Qin et al., 1998a) has verified the earlier medium resolution structure (Papiz et al., 1986) in general terms and has corrected some earlier errors. The amino acid sequences corresponding to the {alpha}-helix, the ß-sheet strands (A to I), and several 3₁₀ turns for the lattice X form are indicated in Figure 7. There are slight differences among the crystal lattice forms as to the details of the secondary structural elements. In all cases, the ß-I strand is an important feature of the interface between the 2 monomers that constitute the dimer in all of the high resolution crystal structures. The NMR structures (Kuwata et al., 1999; Uhrínová et al., 2000) at about pH 2.5 contain the same strands and helix, confirming the polypeptide fold.

Another feature of bovine ß-LG is the ability to bind hydrophobic and amphiphilic molecules ranging from hexane to palmitic acid to vitamin D (Hambling et al., 1992; Pérez and Calvo, 1995; Narayan and Berliner, 1997; Sawyer, 2003). Considerable attention has been paid to the binding of retinol (vitamin A), which is essential for mammalian growth and well being, to ß-LG, and ß-LG is considered a member of the lipocalin family of proteins (Sawyer, 2003). Although some retinoids and fatty acids can bind in the deep hydrophobic pocket of ß-LG (Cho et al., 1994; Qin et al., 1998b; Wang et al., 1999; Kontopidis et al., 2002), there is some doubt about the biological role of this protein. The original biological role could have been related to maternal physiology, but this may have shifted to a more nutritional role for some species (Kontopidis et al., 2002).

The observation that ß-LG may be glycosylated (Léonil et al., 1997) prior to milking, but probably external to the mammary epithelium, is interesting and suggests a chemical rather than a biochemical reaction. More extensive modification (lactosylation) occurs in heated milk or whey (Burr et al., 1996; Léonil et al., 1997; Morgan et al., 1998), where Lys47 and Lys91 are the most reactive, and Lys8 and Lys141 are the least reactive (Morgan et al., 1998).

The practice of naming a new variant "X" until such time as the sequence has been demonstrated should save embarrassment and/or duplication of protein names. However, there is the possibility that such a ß-LG X could be confused with the X, Y, and Z lattice forms, a nomenclature used by the crystallographers. The various engineered proteins often have slight differences at the N-terminus of the protein because of the method of synthesis. (For example, Kim et al. [1997] expressed a ß-LG in which the N-terminal sequence was Glu-Ala-Glu-Ala-Tyr-Val-Thr-, whereas it is Leu-Ile-Val-Thr- in the natural proteins [Figure 7].) It is recommended that such proteins be clearly labelled so that they are not confused with the naturally synthesized proteins, because the difference in structure would give the proteins different properties, such as electrophoretic mobility.

{alpha}-LA
The whey protein {alpha}-LA has a specific and defined physiological function in the mammary gland. Within the Golgi apparatus of the mammary epithelial cell, {alpha}-LA interacts with the ubiquitously expressed enzyme ß-1,4-galactosyltransferase to form the lactose synthase complex. {alpha}-Lactalbumin modifies the substrate specificity of ß-1,4-galactosyltransferase, allowing the formation of lactose from glucose and UDP-galactose. The constitutive function of ß-1,4-galactosyltransferase, to glycosylate glycoproteins and glycolipids, is reversibly altered by combining with {alpha}-LA in a 1:1 molar ratio. The production of lactose, its function as the major osmolyte of milk, and the function of the lactose synthase complex have been the topics of a number of reviews (Brew and Hill, 1975; Hill and Brew, 1975; Jones, 1977; Kuhn et al., 1980).

Bovine milk contains {alpha}-LA at a concentration of approximately 1.2 to 1.5 g/L (Jenness, 1974). The protein has been sequenced, and the nucleotide sequence has been confirmed (Vilotte et al., 1987; Brew et al., 1970; Bleck and Bremel, 1993a). The mature {alpha}-LA (Figure 8) is a 123-amino acid globular protein (Brew et al., 1970). The {alpha}-LA signal peptide is composed of 19 amino acids, making the pre-form of {alpha}-LA 142 amino acids in length (Gaye et al., 1987; Hurley and Schuler, 1987). The mature {alpha}-LA protein has 2 predominant genetic variants (A and B) that have been confirmed by sequence analysis (Table 2) (Bhattacharya et al., 1963). The B variant is present in the milk of most Bos taurus cattle, and both the A and B variants are found in Bos indicus cattle (Jenness, 1974). The {alpha}-LA A variant is present at a low frequency in some Italian and Eastern European Bos taurus breeds (Mariani and Russo, 1977). The A variant contains a Glu at position 10 of the mature protein, and the B variant has an Arg substitution at that position (Gordon, 1971). A third genetic variant, {alpha}-LA C, has also been reported but not confirmed by DNA or protein sequencing (Bell et al., 1981). This variant was identified in Bali cattle (Bos javanicus). The C variant was reported to differ from the B variant by having either an Asn for Asp or a Gln for Glu substitution. The B variant is the reference protein for the family and is composed of the following amino acid residues: Ala3, Arg1, Asn8, Asp13, Cys8, Gln6, Glu7, Gly6, His3, Ile8, Leu13, Lys12, Met1, Phe4, Pro2, Ser7, Thr7, Trp4, Tyr4, Val6 (Brew et al., 1970). Both the A and B variants contain 4 disulfide bonds. The B variant has a formula molecular weight of 14,178; its sequence is given in Figure 8 and has been corrected since the last report to take into account the nucleic acid sequences noted previously. The ExPASy entry name for {alpha}-LA is LACA_Bovin, and its file number is P00741. {alpha}-Lactalbumin has a very high content of the essential amino acids (Trp, Phe, Tyr, Leu, Ile, Thr, Met, Cys, Lys, and Val). Essential amino acids account for 63.2% of the total amino acid content compared with just 51.4% for total CN (Heine et al., 1991). The amino acid composition of bovine {alpha}-LA and its 72% sequence identity to human {alpha}-LA make it an ideal protein for the nutrition of human infants (Heine et al., 1991).



Figure 8. Primary sequence of Bos {alpha}-LA B (Brew et al., 1970; Vanaman et al., 1970). The disulfide bridges in the molecule are between positions 6 and 120, 28 and 111, 61 and 77, and 73 and 91. The amino acid residues corresponding to the differences in genetic variants A and C are given in Table 2. The PDB (protein structural database) file for this protein is 1F6S.

 
A small percentage of the {alpha}-LA found in the milk of cattle is glycosylated on an Asn residue (Barman, 1970). The N-linked glycosylation signal occurs at amino acids 45 to 47 (Asn-Gln-Ser) of mature {alpha}-LA (Figure 8). The reason why only a small portion of the protein is glycosylated is not clear. The mechanism that has been suggested is that the glycosylation machinery has poor accessibility to the glycosylation site in the mature folded protein (Pless and Lennarz, 1977). However, examination of its x-ray structures (Chrysina et al., 2000) does not support this concept. The potential physiological relevance of the glycosylation has not been determined. Bovine {alpha}-LA is not phosphorylated in its native form (Bingham et al., 1988). However, {alpha}-LA becomes a good substrate for CN kinase in vitro after it has been reduced and carboxymethylated (Bingham et al., 1988).
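
The Asn-Gln-Ser signal is an instance of the general N-glycosylation sequon, Asn-X-Ser/Thr with X being any residue except Pro. A short Python sketch of how such sites can be located in a one-letter sequence follows; the function name find_sequons and the toy peptide are hypothetical, and the toy peptide is not the {alpha}-LA sequence.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr. The lookahead keeps
# the match to a single character so that overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return (1-based position of Asn, tripeptide) for each sequon."""
    return [
        (match.start() + 1, sequence[match.start():match.start() + 3])
        for match in SEQUON.finditer(sequence)
    ]


if __name__ == "__main__":
    # Hypothetical peptide: NQS qualifies, NPT does not (Pro at X).
    toy_peptide = "GANQSLLNPTK"
    print(find_sequons(toy_peptide))
```

A sequon is necessary but not sufficient for glycosylation, which is consistent with the observation above that only a small fraction of {alpha}-LA carries carbohydrate.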

In dairy cattle, the concentration of {alpha}-LA in milk decreases near the end of a lactation (Caffin et al., 1985; Regester and Smithers, 1991). This is opposite of what occurs for the other major bovine milk proteins; their concentrations tend to increase as a lactation progresses (Davies and Law, 1980). The decline in {alpha}-LA concentration is correlated with the decline observed in the concentration of milk lactose at the end of a lactation. Lower concentrations of {alpha}-LA have also been observed in cows that have mammary infections (Caffin et al., 1985).

The protein structure, amino acid sequence, and DNA sequence of {alpha}-LA are very similar to that of the c-type lysozymes (McKenzie and White, 1991), and t