Python (re)

HTTP Content-Type Header in Python

Parse an HTTP Content-Type header line, capturing the media type and the optional charset parameter.

Pattern

Content-Type:\s*([a-z]+\/[\w.+\-]+)(?:;\s*charset=([\w\-]+))?   (flags: gi; in Python, re.IGNORECASE with finditer)

Python (re) code

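import re

# Group 1 is the media type (type/subtype); group 2 is the charset, or None if absent.
pattern = re.compile(r"Content-Type:\s*([a-z]+\/[\w.+\-]+)(?:;\s*charset=([\w\-]+))?", re.IGNORECASE)
input_text = "Content-Type: text/html; charset=utf-8"
for m in pattern.finditer(input_text):
    print(m.group(0))              # full match
    print(m.group(1), m.group(2))  # media type and charset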

Uses the standard-library `re` module, so there is no third-party dependency. Works on Python 3.6+.

How the pattern works

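Content-Type:\s* matches the header name followed by optional whitespace (\s* also accepts none). ([a-z]+\/[\w.+\-]+) captures the type/subtype as group 1; the i flag (re.IGNORECASE in Python) makes the whole pattern case-insensitive, including the header name. (?:;\s*charset=([\w\-]+))? optionally captures an unquoted charset parameter that directly follows the media type, such as UTF-8 or iso-8859-1, as group 2. When no charset is present, group 2 is None.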
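To make the two capture groups concrete, here is a minimal sketch that wraps the pattern in a helper. The names `CONTENT_TYPE_RE` and `parse_content_type` are hypothetical, and lowercasing the result is an assumption made for easier comparison, not something the pattern itself does.

```python
import re

# Same pattern as above; the constant and function names are illustrative only.
CONTENT_TYPE_RE = re.compile(
    r"Content-Type:\s*([a-z]+\/[\w.+\-]+)(?:;\s*charset=([\w\-]+))?",
    re.IGNORECASE,
)


def parse_content_type(line):
    """Return (media_type, charset), or None if the line is not a Content-Type header.

    charset is None when the header carries no charset parameter.
    """
    m = CONTENT_TYPE_RE.search(line)
    if m is None:
        return None
    media_type, charset = m.group(1), m.group(2)
    # Assumption: normalize to lowercase so callers can compare values directly.
    return (media_type.lower(), charset.lower() if charset else None)


if __name__ == "__main__":
    print(parse_content_type("Content-Type: text/html; charset=UTF-8"))
    print(parse_content_type("content-type: application/json"))
    print(parse_content_type("Authorization: Basic xyz"))
```

Returning None for a non-matching line keeps "not a Content-Type header" distinct from "header present but no charset".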

Examples

Input

Content-Type: text/html; charset=utf-8

Matches

Content-Type: text/html; charset=utf-8

Input

content-type: application/json

Matches

content-type: application/json

Input

Authorization: Basic xyz

No match
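The three examples above can be checked in one pass with a short sketch like the following. The fourth sample, with a quoted charset, is an extra case added here for illustration: because the charset class [\w\-] excludes the double quote, the optional group is skipped and only the media type is captured.

```python
import re

pattern = re.compile(
    r"Content-Type:\s*([a-z]+\/[\w.+\-]+)(?:;\s*charset=([\w\-]+))?",
    re.IGNORECASE,
)

samples = [
    "Content-Type: text/html; charset=utf-8",
    "content-type: application/json",
    "Authorization: Basic xyz",
    'Content-Type: text/html; charset="utf-8"',  # extra case: quoted charset
]

results = []
for line in samples:
    m = pattern.search(line)
    # groups() yields (media_type, charset); charset is None when not captured.
    results.append(m.groups() if m else None)
    print(f"{line!r} -> {results[-1]}")
```

If quoted charset values matter for your input, the pattern would need to be extended; that is outside what this page's pattern covers.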

Same pattern, other engines

JavaScript / ECMAScript

Supported

See how this pattern looks (and behaves) in JavaScript's built-in RegExp.

Go (RE2)

Supported

See how this pattern looks (and behaves) in Go's `regexp` package (RE2 engine).
