Text Processing

flags: g

Non-ASCII Character

Matches runs of one or more consecutive non-ASCII characters (anything outside U+0000–U+007F).

Available in

JavaScript / ECMAScript code

JavaScript

const re = new RegExp("[^\\x00-\\x7F]+", "g");
const input = "Hello, café!";
const matches = [...input.matchAll(re)];
console.log(matches.map(m => m[0]));

Uses `String.prototype.matchAll` to iterate over all matches (Node 12+ and all modern browsers). `matchAll` requires the `g` flag and throws a `TypeError` if the regex lacks it.

Python (re) code

Python

import re

pattern = re.compile(r"[^\x00-\x7F]+")
input_text = "Hello, café!"
for m in pattern.finditer(input_text):
    print(m.group(0))

Uses the standard-library `re` module, so there is no third-party dependency. Works on Python 3.6+.

Go (RE2) code

Go

package main

import (
	"fmt"
	"regexp"
)

func main() {
	re := regexp.MustCompile(`[^\x00-\x7F]+`)
	input := `Hello, café!`
	for _, match := range re.FindAllString(input, -1) {
		fmt.Println(match)
	}
}

Uses `regexp.MustCompile`, which panics if the pattern fails to compile, and `FindAllString` with `n = -1` to return all matches.

Pattern

regex (engine-agnostic)

[^\x00-\x7F]+   (flags: g)

Raw source: [^\x00-\x7F]+

How it works

`[^\x00-\x7F]` is a negated character class: it matches any character NOT in the ASCII range 0x00–0x7F. The trailing `+` groups consecutive non-ASCII characters into a single match, so `café` yields `é`, `naïve` yields `ï`, and `東京` yields the whole run `東京` rather than two separate matches. This is useful for finding accented characters, emoji, CJK, and other non-ASCII text in otherwise-ASCII source.
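As a rough illustration of the "otherwise-ASCII source" use case, the sketch below reports where each non-ASCII run occurs in a block of text. The helper name `find_non_ascii` and the sample string are hypothetical and exist only for this example; only the pattern itself comes from this page. It assumes the input is a Python 3 `str`, not `bytes`.

```python
import re

# Same pattern as on this page.
NON_ASCII = re.compile(r"[^\x00-\x7F]+")


def find_non_ascii(text):
    """Yield (line_number, column, run) for each non-ASCII run (both 1-based).

    Hypothetical helper for illustration. Lines are split on "\n" only,
    because str.splitlines() also breaks on some non-ASCII separators
    (for example U+0085), which would hide exactly what we are looking for.
    """
    for line_no, line in enumerate(text.split("\n"), start=1):
        for m in NON_ASCII.finditer(line):
            yield line_no, m.start() + 1, m.group(0)


# Illustrative input: escapes are used so the sample does not depend on
# how this file itself is encoded or normalized.
sample = "name = 'caf\u00e9'\ncity = '\u6771\u4eac'\nplain = 'ascii'\n"

for hit in find_non_ascii(sample):
    print(hit)
```

With this sample, the first line produces one single-character run (the `é`) and the second produces one two-character run (`東京`), which shows the effect of the trailing `+`. The third line is pure ASCII and produces nothing.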