Python (re)

BCP 47 Language Tag in Python

Validate BCP 47 language tags like `en`, `en-US`, `zh-Hant-TW`, or `pt-BR`.

Pattern

^[a-zA-Z]{2,3}(?:-[A-Za-z0-9]{2,8})*$

Python (re) code

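import re

# Shape check only: a 2-3 letter primary subtag, then hyphen-separated subtags.
pattern = re.compile(r"^[a-zA-Z]{2,3}(?:-[A-Za-z0-9]{2,8})*$")
input_text = "en"
# fullmatch() must consume the whole string, so "en\n" is rejected as well.
m = pattern.fullmatch(input_text)
if m:
    print(m.group(0))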

Uses only the standard-library `re` module, so there is no third-party dependency. Works on Python 3.6+.
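
One edge case worth checking is how the `$` anchor treats a trailing newline. The sketch below compares `match()` with `fullmatch()` on the same pattern. The sample input is illustrative and is not one of the examples on this page.

```python
import re

pattern = re.compile(r"^[a-zA-Z]{2,3}(?:-[A-Za-z0-9]{2,8})*$")

tag_with_newline = "en-US\n"  # illustrative input, e.g. a line read from a file

# `$` matches just before a trailing newline, so match() accepts this input.
loose = pattern.match(tag_with_newline) is not None

# fullmatch() requires the pattern to consume the entire string, newline included.
strict = pattern.fullmatch(tag_with_newline) is not None

print(loose, strict)  # True False
```

If input may carry a trailing newline, either strip it first or prefer `fullmatch()` (or the `\Z` anchor) over a bare `$`.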

How the pattern works

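`^[a-zA-Z]{2,3}` matches the primary language subtag: two or three ASCII letters. `(?:-[A-Za-z0-9]{2,8})*` then matches zero or more further subtags (script, region, variant), each introduced by a hyphen and made of 2–8 ASCII alphanumerics. `$` anchors the end of the tag. In Python, `$` also matches just before a trailing newline, so use `fullmatch()` or `\Z` when the input may end in one.

The pattern checks structure only. It does not check subtags against the IANA registry or enforce their order, and it rejects valid tags that contain single-character subtags, such as `en-u-ca-gregory` or `de-x-private`. For spec-compliant validation, use a real BCP 47 parser.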
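
As a rough sketch of how the two parts of the pattern divide a tag, the hypothetical helper `split_tag` below (the name and return shape are assumptions, not part of any library) checks the shape and then separates the primary subtag from the rest. It does not label subtags as script, region, or variant, since the pattern itself cannot tell them apart.

```python
import re

# Same pattern as above; anchors are omitted because fullmatch() is used.
TAG_RE = re.compile(r"[a-zA-Z]{2,3}(?:-[A-Za-z0-9]{2,8})*")

def split_tag(tag):
    """Return (primary, other_subtags) if tag has the expected shape, else None."""
    if TAG_RE.fullmatch(tag) is None:
        return None
    primary, *rest = tag.split("-")
    return primary, rest

print(split_tag("zh-Hant-TW"))       # ('zh', ['Hant', 'TW'])
print(split_tag("ENGLISH"))          # None: primary subtag longer than 3 letters
print(split_tag("en-u-ca-gregory"))  # None: single-letter subtag "u" is rejected
```

The last call shows the main limitation: the tag is well-formed under BCP 47, but the 2–8 length rule in the pattern excludes the one-character `u` extension marker.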

Examples

Input

en

Matches

  • en

Input

zh-Hant-TW

Matches

  • zh-Hant-TW

Input

ENGLISH

No match
