Text Processingflags: g

Non-ASCII Character

Match runs of non-ASCII characters (anything outside U+0000–U+007F).

Try it in RegexPro →

Available in

Pattern

regexengine-agnostic
[^\x00-\x7F]+   (flags: g)

Raw source: [^\x00-\x7F]+

How it works

[^\x00-\x7F] is a negated character class: anything NOT in the ASCII range 0x00–0x7F. The trailing + groups consecutive non-ASCII characters into a single match (so `café` matches as `é`, `naïve` as `ï`, etc.). Useful for finding accented characters, emoji, CJK, and other Unicode in otherwise-ASCII source.

Examples

Input

Hello, café!

Matches

  • é

Input

naïve résumé 🎉

Matches

  • ï
  • é
  • é
  • 🎉

Input

plain ascii here

No match

Common use cases

  • Auditing source code for non-ASCII identifiers
  • Encoding-bug detection in legacy data
  • Building i18n test cases
  • Migrating between charsets

Related patterns