HTTP Content-Type Header in Python
Parse the value of an HTTP Content-Type header, capturing the media type and optional charset.
Pattern
Content-Type:\s*([a-z]+\/[\w.+\-]+)(?:;\s*charset=([\w\-]+))? (flags: gi)
Python (re) code
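import re

# Group 1 is the media type; group 2 is the charset (None when absent).
pattern = re.compile(r"Content-Type:\s*([a-z]+\/[\w.+\-]+)(?:;\s*charset=([\w\-]+))?", re.IGNORECASE)
input_text = "Content-Type: text/html; charset=utf-8"
# finditer stands in for the g flag: it yields every match in the input.
for m in pattern.finditer(input_text):
    print(m.group(0))
Uses the stdlib `re` module, with no third-party dependency. Works on Python 3.6+.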
How the pattern works
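Content-Type:\s* matches the header name and any whitespace after the colon (\s* also allows none). ([a-z]+\/[\w.+\-]+) captures the type/subtype as group 1; the i flag, re.IGNORECASE in Python, makes the whole pattern case-insensitive, so content-type: and TEXT/HTML match too. (?:;\s*charset=([\w\-]+))? optionally captures a charset parameter (UTF-8, iso-8859-1, etc.) as group 2, which is None when the header has no charset. The g flag has no Python equivalent; finditer covers it by returning every match.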
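To make the two capture groups concrete, here is a minimal sketch that wraps the pattern in a helper. The names CONTENT_TYPE_RE and parse_content_type are hypothetical, chosen for illustration only; the regex itself is unchanged.

```python
import re

# Same pattern as above; the constant name is an illustrative assumption.
CONTENT_TYPE_RE = re.compile(
    r"Content-Type:\s*([a-z]+\/[\w.+\-]+)(?:;\s*charset=([\w\-]+))?",
    re.IGNORECASE,
)


def parse_content_type(header_line):
    """Return (media_type, charset) for a Content-Type header line.

    charset is None when the header carries no charset parameter.
    Returns None when the line is not a Content-Type header.
    (Hypothetical helper, not part of the original snippet.)
    """
    m = CONTENT_TYPE_RE.search(header_line)
    if m is None:
        return None
    # group(1) is the type/subtype, group(2) the optional charset.
    return (m.group(1), m.group(2))


print(parse_content_type("Content-Type: text/html; charset=utf-8"))
print(parse_content_type("content-type: application/json"))
print(parse_content_type("Authorization: Basic xyz"))
```

Returning None for a non-matching line keeps "no such header" distinct from "header present without a charset", which comes back as a tuple whose second element is None.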
Examples
Input
Content-Type: text/html; charset=utf-8
Matches
Content-Type: text/html; charset=utf-8
Input
content-type: application/json
Matches
content-type: application/json
Input
Authorization: Basic xyz
No match
(none)
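The three examples above can also be checked programmatically. This is a sketch only: the EXAMPLES list and the full_match_text helper are hypothetical names, and None stands for "no match".

```python
import re

pattern = re.compile(
    r"Content-Type:\s*([a-z]+\/[\w.+\-]+)(?:;\s*charset=([\w\-]+))?",
    re.IGNORECASE,
)

# (input, expected full match) pairs copied from the examples;
# None means the pattern should not match.
EXAMPLES = [
    ("Content-Type: text/html; charset=utf-8",
     "Content-Type: text/html; charset=utf-8"),
    ("content-type: application/json", "content-type: application/json"),
    ("Authorization: Basic xyz", None),
]


def full_match_text(text):
    """Return the matched text (group 0), or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None


for text, expected in EXAMPLES:
    result = full_match_text(text)
    status = "ok" if result == expected else "MISMATCH"
    print(f"{status}: {text!r} -> {result!r}")
```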