HTML Entity in PY
Match HTML entities in named (`&amp;`), numeric (`&#123;`), or hex (`&#x1F4A9;`) form.
Pattern
&(?:[a-zA-Z][a-zA-Z0-9]+|#\d+|#x[0-9a-fA-F]+); (flags: g)
Python (re) code
import re

# finditer yields every non-overlapping match, which covers the `g` flag.
pattern = re.compile(r"&(?:[a-zA-Z][a-zA-Z0-9]+|#\d+|#x[0-9a-fA-F]+);")
input_text = "Tom &amp; Jerry &lt;3"
for m in pattern.finditer(input_text):
    print(m.group(0))  # prints &amp; then &lt;
Stdlib `re` module, no third-party dependency. Works on Python 3.6+.
How the pattern works
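The leading `&` and trailing `;` bracket the entity. The middle alternation matches one of three forms: a named entity (`[a-zA-Z][a-zA-Z0-9]+`, a letter followed by one or more alphanumerics, like `amp`, `lt`, `nbsp`); a decimal entity (`#\d+`, like `#160`); or a hex entity (`#x[0-9a-fA-F]+`, like `#xA0` or `#x1F600` for emoji). The pattern checks only the shape of the reference, not whether the name is a defined HTML entity.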
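As a rough illustration of the three branches, the sketch below rewrites the same pattern with `re.VERBOSE` and named groups so that each match reports which form it took. The group names (`named`, `dec`, `hex`) and the `classify` helper are made up for this example and are not part of the original pattern.

```python
import re

# Same pattern as above, spread out with re.VERBOSE. In verbose mode an
# unescaped '#' starts a comment, so the literal '#' is written as '\#'.
ENTITY = re.compile(
    r"""
    &
    (?:
        (?P<named>[a-zA-Z][a-zA-Z0-9]+)   # named: amp, lt, nbsp
      | \#(?P<dec>\d+)                    # decimal: #160
      | \#x(?P<hex>[0-9a-fA-F]+)          # hex: #xA0, #x1F600
    )
    ;
    """,
    re.VERBOSE,
)


def classify(text):
    """Return (form, body) pairs for every entity found in text."""
    # Only one branch can match, so m.lastgroup names that branch.
    return [(m.lastgroup, m.group(m.lastgroup)) for m in ENTITY.finditer(text)]


print(classify("&amp; &#160; &#x1F600;"))
# [('named', 'amp'), ('dec', '160'), ('hex', '1F600')]
```

Capturing the body separately from the `&`, `#`, and `;` delimiters makes it easier to hand the value to a later decoding step, at the cost of a longer pattern.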
Examples
Input
Tom &amp; Jerry &lt;3
Matches
&amp;
&lt;
Input
Numeric: &#160; Hex: &#x1F600;
Matches
&#160;
&#x1F600;
Input
no entities here
No match
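The three examples can be checked directly with `re.findall`. This is a minimal sketch that assumes the inputs are written with their literal entity text (`&amp;`, `&lt;`, `&#160;`, `&#x1F600;`); the `find_entities` helper name is made up for illustration.

```python
import re

PATTERN = re.compile(r"&(?:[a-zA-Z][a-zA-Z0-9]+|#\d+|#x[0-9a-fA-F]+);")


def find_entities(text):
    """Return every entity in text, in order of appearance."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return PATTERN.findall(text)


print(find_entities("Tom &amp; Jerry &lt;3"))           # ['&amp;', '&lt;']
print(find_entities("Numeric: &#160; Hex: &#x1F600;"))  # ['&#160;', '&#x1F600;']
print(find_entities("no entities here"))                # []
print(find_entities("Tom & Jerry"))                     # [] (bare ampersand)
```

A bare `&` that is not followed by a name or number and a closing `;` is left alone, which is why the last call returns an empty list.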