Python (re)

HTTP Content-Type Header in PY

Parse the value of an HTTP Content-Type header, capturing the media type and optional charset.

Pattern

Content-Type:\s*([a-z]+\/[\w.+\-]+)(?:;\s*charset=([\w\-]+))?   (flags: re.IGNORECASE; Python has no g flag, so use finditer to get every match)

Python (re) code

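import re

# Group 1 is the media type (type/subtype); group 2 is the optional charset.
pattern = re.compile(r"Content-Type:\s*([a-z]+\/[\w.+\-]+)(?:;\s*charset=([\w\-]+))?", re.IGNORECASE)
input_text = "Content-Type: text/html; charset=utf-8"
# finditer yields every non-overlapping match in the input.
for m in pattern.finditer(input_text):
    print(m.group(0))  # the full matched text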

Stdlib `re` module — no third-party dependency. Works on Python 3.6+.

How the pattern works

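Content-Type:\s* matches the header name followed by optional whitespace (\s* also matches zero characters). ([a-z]+\/[\w.+\-]+) captures the type/subtype as group 1; re.IGNORECASE makes the whole pattern case-insensitive. (?:;\s*charset=([\w\-]+))? optionally captures a charset parameter such as UTF-8 or iso-8859-1 as group 2. The charset is captured only when it directly follows the media type; if another parameter comes first, or no charset is present, group 2 is None.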
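The two capture groups described above can be read back with m.group(1) and m.group(2). The following is a minimal sketch, not a full header parser: the helper name parse_content_type and the choice to lowercase both values are assumptions made for illustration and are not part of the original pattern.

```python
import re

CONTENT_TYPE_RE = re.compile(
    r"Content-Type:\s*([a-z]+\/[\w.+\-]+)(?:;\s*charset=([\w\-]+))?",
    re.IGNORECASE,
)


def parse_content_type(line):
    """Hypothetical helper: return (media_type, charset), or None if no match."""
    m = CONTENT_TYPE_RE.search(line)
    if m is None:
        return None
    # Group 1 always participates in a successful match.
    media_type = m.group(1).lower()
    # Group 2 is None when the optional charset part did not match.
    charset = m.group(2).lower() if m.group(2) else None
    return media_type, charset


print(parse_content_type("Content-Type: text/html; charset=UTF-8"))
# ('text/html', 'utf-8')
print(parse_content_type("Content-Type: multipart/form-data; boundary=x"))
# ('multipart/form-data', None)
```

Lowercasing is a convenience here so that callers can compare values without worrying about case; drop it if the original spelling matters.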

Examples

Input

Content-Type: text/html; charset=utf-8

Matches

  • Content-Type: text/html; charset=utf-8

Input

content-type: application/json

Matches

  • content-type: application/json

Input

Authorization: Basic xyz

No match
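
The three examples above can be checked in one pass. This is a small sketch; the variable names inputs and results are illustrative only.

```python
import re

pattern = re.compile(
    r"Content-Type:\s*([a-z]+\/[\w.+\-]+)(?:;\s*charset=([\w\-]+))?",
    re.IGNORECASE,
)

# The three inputs from the examples above.
inputs = [
    "Content-Type: text/html; charset=utf-8",
    "content-type: application/json",
    "Authorization: Basic xyz",
]

# Map each input to its list of full matches (an empty list means no match).
results = {text: [m.group(0) for m in pattern.finditer(text)] for text in inputs}

for text, matches in results.items():
    print(repr(text), "->", matches or "No match")
```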
