Amino Acid Notation Converter

Convert peptide sequences between 1-letter and 3-letter amino acid notation. Supports multiple separator formats.

3-letter output formats:
Ala-Cys-Gly (hyphen-separated)
Ala Cys Gly (space-separated)
Ala.Cys.Gly (dot-separated)
AlaCysGly (no separator)
H-Ala-Cys-OH (explicit termini)
Quick examples:
CYIQNCPLG (Oxytocin)
RPPGFSPFR (Bradykinin)
YGGFM (Met-Enkephalin)
DRVYIHPFHL (Angiotensin I)
RPKPQQFFGLM (Substance P)
HSDGTFT… (Secretin)
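
As a rough illustration of the 1-letter to 3-letter direction, the conversion and the separator formats can be sketched in Python. The function name `one_to_three` and its parameters `separator` and `explicit_termini` are hypothetical and are not taken from the tool itself; only the 20 standard codes are handled.

```python
# Standard 1-letter -> 3-letter codes for the 20 canonical amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def one_to_three(sequence, separator="-", explicit_termini=False):
    """Convert a 1-letter sequence to 3-letter notation (illustrative helper)."""
    residues = []
    for letter in sequence.strip().upper():
        if letter not in ONE_TO_THREE:
            raise ValueError(f"Unknown 1-letter code: {letter!r}")
        residues.append(ONE_TO_THREE[letter])
    if explicit_termini:
        # Assumption: the H-...-OH form is always hyphen-joined.
        return "H-" + "-".join(residues) + "-OH"
    return separator.join(residues)


print(one_to_three("ACG"))                      # Ala-Cys-Gly
print(one_to_three("ACG", separator="."))       # Ala.Cys.Gly
print(one_to_three("AC", explicit_termini=True))  # H-Ala-Cys-OH
```

Raising an error on unknown letters, rather than silently skipping them, keeps the output length equal to the input length, which is usually what a notation converter should guarantee.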

All 20 Standard Amino Acids

Colors indicate side-chain chemical class:
Nonpolar / hydrophobic
Polar / uncharged
Positively charged (basic)
Negatively charged (acidic)
Special (Glycine)

Non-Standard & Ambiguous IUPAC Codes

Beyond the 20 canonical amino acids, IUPAC defines additional single-letter codes for ambiguous residues and two rare genetically encoded amino acids.

B
Asx — Asp or Asn
Used when sequencing cannot distinguish aspartate (D) from asparagine (N). Their residue masses differ by only about 1 Da, and asparagine is converted to aspartate by deamidation (for example during acid hydrolysis).
Z
Glx — Glu or Gln
Ambiguity code for glutamate (E) or glutamine (Q). Arises when mass spectrometry or Edman degradation cannot definitively assign the residue.
X
Xaa — Unknown residue
A placeholder for any amino acid — used when the identity of a residue is completely unknown, or for variable positions in sequence alignments and peptide libraries.
U
Sec — Selenocysteine
The 21st genetically encoded amino acid. Structurally identical to cysteine except that selenium replaces sulfur. Present in ~25 human selenoproteins, including glutathione peroxidase. Encoded by UGA, normally a stop codon, which is recoded as Sec by an mRNA structure called the SECIS element.
O
Pyl — Pyrrolysine
The 22nd genetically encoded amino acid, found in certain methanogenic archaea and a few bacteria. Encoded by the UAG amber stop codon. Consists of a methylpyrroline ring attached to the side chain of lysine. Its dedicated translational machinery (the pyrrolysyl-tRNA synthetase/tRNA pair) is now exploited to incorporate bioorthogonal handles for site-specific protein modification.
J
Xle — Leu or Ile
Ambiguity code for leucine (L) or isoleucine (I), two isomeric amino acids whose residues share the formula C₆H₁₁NO and therefore have exactly the same mass. Standard mass spectrometry cannot distinguish them; telling them apart requires specialised methods such as ion-mobility MS, side-chain fragmentation, or specific chemical derivatization.
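
One way to work with the ambiguity codes B, Z, and J is to expand a sequence into every concrete sequence it could represent. The sketch below is a minimal illustration; the helper name `expand_ambiguous` is hypothetical, and X, U, and O are deliberately passed through unchanged (expanding X would multiply the output twentyfold per position).

```python
from itertools import product

# Ambiguity codes mapped to the residues they may stand for.
AMBIGUOUS = {"B": "DN", "Z": "EQ", "J": "LI"}


def expand_ambiguous(sequence):
    """Return every concrete sequence consistent with B, Z and J codes."""
    # Each position contributes either its own letter or its alternatives.
    choices = [AMBIGUOUS.get(letter, letter) for letter in sequence.upper()]
    return ["".join(combo) for combo in product(*choices)]


print(expand_ambiguous("ABG"))  # ['ADG', 'ANG']
print(expand_ambiguous("BJ"))   # ['DL', 'DI', 'NL', 'NI']
```

The number of results doubles with each ambiguous position, so this approach is only practical for sequences with a handful of B, Z, or J residues.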

Peptide Notation Conventions

🔤
Why Two Systems?

The 1-letter code was introduced in 1968 (Dayhoff et al.) to enable efficient computational sequence storage and comparison. Before that, biochemists used only the 3-letter code, which remains standard in structural biology and synthesis papers. Today: 1-letter code dominates bioinformatics and sequence databases (UniProt, NCBI); 3-letter is preferred in synthetic chemistry, pharmacology, and drug patents.

➡️
N→C Direction & the H-…-OH Format

By convention, peptide sequences are written N-terminus → C-terminus (left to right). The H-Ala-Cys-Gly-OH format makes the termini explicit: H- denotes the free amino group (H₂N-) at the N-terminus, and -OH denotes the free carboxyl group (-COOH) at the C-terminus. This matters in synthetic chemistry, where terminal modifications such as acetylation (Ac-), amidation (-NH₂), and PEGylation must be specified explicitly because they alter the peptide's charge and stability.
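
The reverse direction, reading a hyphen-separated 3-letter string with optional terminal groups, can be sketched as follows. This is an illustrative parser under stated assumptions, not the tool's implementation: the name `three_to_one` is hypothetical, only the H-, Ac-, -OH, and -NH₂ groups are recognised, and a missing terminal label is assumed to mean a free terminus.

```python
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}
THREE_TO_ONE = {three: one for one, three in ONE_TO_THREE.items()}

N_TERMINAL = {"H", "Ac"}     # free amine, acetylated
C_TERMINAL = {"OH", "NH2"}   # free acid, amidated


def three_to_one(peptide):
    """Parse e.g. 'H-Ala-Cys-Gly-OH' into (n_terminus, sequence, c_terminus)."""
    # Accept the subscript form NH₂ as well as plain NH2.
    tokens = peptide.strip().replace("₂", "2").split("-")
    n_term, c_term = "H", "OH"  # assumption: unlabeled termini are free
    if tokens and tokens[0] in N_TERMINAL:
        n_term = tokens.pop(0)
    if tokens and tokens[-1] in C_TERMINAL:
        c_term = tokens.pop()
    try:
        sequence = "".join(THREE_TO_ONE[t.capitalize()] for t in tokens)
    except KeyError as err:
        raise ValueError(f"Unknown 3-letter code: {err.args[0]!r}") from err
    return n_term, sequence, c_term


print(three_to_one("H-Ala-Cys-Gly-OH"))            # ('H', 'ACG', 'OH')
print(three_to_one("Ac-Tyr-Gly-Gly-Phe-Met-NH2"))  # ('Ac', 'YGGFM', 'NH2')
```

Returning the termini separately from the sequence avoids losing the modification information, which the 1-letter code has no standard way to carry.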

🔄
D- and L- Stereoisomers

All standard amino acids except glycine have a chiral centre and exist as L- and D-forms. Ribosomally synthesised proteins are built from L-amino acids (homochirality). D-amino acids do nevertheless occur naturally: bacterial cell walls contain D-Ala and D-Glu, and the frog skin peptides dermorphin and deltorphins I and II contain D-Ala, which makes them protease-resistant and highly potent at opioid receptors. In drug design, replacing L- with D-residues at protease cleavage sites is a key strategy for extending peptide half-life in vivo.

🔁
Cyclic Peptides & Special Notations

Plain linear notation cannot express cyclisation. Common conventions: cyclo(sequence) for head-to-tail cyclic peptides (e.g. cyclosporin A); disulfide bridges are noted as Cys-Cys or by underlining the paired residues. Common terminal modifications in notation: Ac- (N-terminal acetylation, removes the positive charge and improves stability), -NH₂ (C-terminal amidation, removes the negative charge), pGlu- (pyroglutamate, a cyclised N-terminal glutamine or glutamate found naturally in TRH, neurotensin, and several other hormones).
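
Because a head-to-tail cyclic peptide has no unique starting residue, the same ring can be written with any rotation of its sequence inside cyclo(...). The sketch below shows one possible way to normalise such notations so that equivalent rings compare equal. The helper name `canonical_cyclic` and the choice of the alphabetically smallest rotation are assumptions made for illustration, not an established standard; the N→C reading direction is preserved.

```python
import re


def canonical_cyclic(notation):
    """Normalise 'cyclo(A-B-C)' to its alphabetically smallest rotation."""
    match = re.fullmatch(r"cyclo\((.+)\)", notation.strip())
    if match is None:
        raise ValueError(f"Not a cyclo(...) notation: {notation!r}")
    residues = match.group(1).split("-")
    # Every rotation describes the same ring; reversal would not,
    # because it flips the N->C direction of the backbone.
    rotations = [residues[i:] + residues[:i] for i in range(len(residues))]
    return "cyclo(" + "-".join(min(rotations)) + ")"


print(canonical_cyclic("cyclo(Cys-Gly-Ala)"))  # cyclo(Ala-Cys-Gly)
print(canonical_cyclic("cyclo(Gly-Ala-Cys)"))  # cyclo(Ala-Cys-Gly)
```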