BCP 47 Language Tag in PY
Validate BCP 47 language tags like `en`, `en-US`, `zh-Hant-TW`, or `pt-BR`.
Try it in the PY tester →Pattern
regexPY
^[a-zA-Z]{2,3}(?:-[A-Za-z0-9]{2,8})*$Python (re) code
pyPython
import re
pattern = re.compile(r"^[a-zA-Z]{2,3}(?:-[A-Za-z0-9]{2,8})*$")
input_text = "en"
for m in pattern.finditer(input_text):
print(m.group(0))Stdlib `re` module — no third-party dependency. Works on Python 3.6+.
How the pattern works
^[a-zA-Z]{2,3} matches the primary 2- or 3-letter language subtag. (?:-[A-Za-z0-9]{2,8})* matches subsequent subtags (script, region, variant) joined with hyphens. Each subtag is 2–8 alphanumerics. This validates STRUCTURE — for spec-compliant validation use a real BCP 47 parser.
Examples
Input
enMatches
en
Input
zh-Hant-TWMatches
zh-Hant-TW
Input
ENGLISHNo match
—