Go (RE2)

Non-ASCII Character in Go

Match runs of non-ASCII characters (anything outside U+0000–U+007F).


Pattern

[^\x00-\x7F]+   (no flags needed: Go has no `g` flag; FindAllString returns every match)

Go (RE2) code

package main

import (
	"fmt"
	"regexp"
)

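func main() {
	// Compile once; MustCompile panics if the pattern is invalid.
	re := regexp.MustCompile(`[^\x00-\x7F]+`)
	input := `Hello, café!`
	// n = -1 asks for every non-overlapping match.
	for _, match := range re.FindAllString(input, -1) {
		fmt.Println(match) // prints: é
	}
}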

Uses `regexp.MustCompile`, which panics when it is called with an invalid pattern (so it suits fixed patterns written in the source), and `FindAllString` with `n = -1` to return every match.
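
If the pattern comes from configuration or user input rather than a literal in the source, `regexp.Compile` returns an error instead of panicking. The following is a minimal sketch of that variant; the helper name `findNonASCII` is hypothetical and not part of the original example.

```go
package main

import (
	"fmt"
	"regexp"
)

// findNonASCII is a hypothetical helper: it compiles pattern at run time
// and returns every match in input, or an error if the pattern is invalid.
func findNonASCII(pattern, input string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, fmt.Errorf("compile %q: %w", pattern, err)
	}
	return re.FindAllString(input, -1), nil
}

func main() {
	matches, err := findNonASCII(`[^\x00-\x7F]+`, "Hello, café!")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(matches) // [é]

	// An unclosed character class is reported as an error, not a panic.
	_, err = findNonASCII(`[^\x00-\x7F`, "Hello, café!")
	fmt.Println(err != nil) // true
}
```

For a pattern that never changes, `MustCompile` in a package-level variable remains the simpler choice.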

How the pattern works

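`[^\x00-\x7F]` is a negated character class: it matches any single character NOT in the ASCII range 0x00–0x7F. Go's `regexp` package matches UTF-8 code points rather than bytes, so a multi-byte character such as `é` or `🎉` counts as one character. The trailing `+` groups consecutive non-ASCII characters into a single match, so `café` yields `é`, `naïve` yields `ï`, and a run such as `日本語` is returned as one match. Useful for finding accented characters, emoji, CJK, and other non-ASCII text in otherwise-ASCII source.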
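
To make the grouping behaviour concrete, the sketch below reports each run with its byte offsets and its length in code points. The `run` type and the `nonASCIIRuns` helper are hypothetical names introduced only for this illustration; `FindAllStringIndex` and `utf8.RuneCountInString` are standard library calls.

```go
package main

import (
	"fmt"
	"regexp"
	"unicode/utf8"
)

var nonASCII = regexp.MustCompile(`[^\x00-\x7F]+`)

// run describes one match: its text, byte offsets, and code point count.
type run struct {
	Text  string
	Start int // byte offset of the first byte of the run
	End   int // byte offset just past the run
	Runes int // number of code points in the run
}

// nonASCIIRuns (hypothetical helper) returns every run of non-ASCII
// characters in s.
func nonASCIIRuns(s string) []run {
	var runs []run
	for _, loc := range nonASCII.FindAllStringIndex(s, -1) {
		text := s[loc[0]:loc[1]]
		runs = append(runs, run{
			Text:  text,
			Start: loc[0],
			End:   loc[1],
			Runes: utf8.RuneCountInString(text),
		})
	}
	return runs
}

func main() {
	for _, r := range nonASCIIRuns("日本語 and café") {
		fmt.Printf("%q bytes [%d,%d) runes=%d\n", r.Text, r.Start, r.End, r.Runes)
	}
	// "日本語" bytes [0,9) runes=3
	// "é" bytes [17,19) runes=1
}
```

The three CJK characters form one match of nine bytes, while `é` is a separate two-byte match because ASCII text sits between the two runs.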

Examples

Input

Hello, café!

Matches

  • é

Input

naïve résumé 🎉

Matches

  • ï
  • é
  • é
  • 🎉

Input

plain ascii here

No match
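
The three examples above can be reproduced with a short program. This is a sketch rather than part of the original page: `hasNonASCII` is a hypothetical helper that uses `MatchString` for the yes/no question before listing the matches.

```go
package main

import (
	"fmt"
	"regexp"
)

var nonASCII = regexp.MustCompile(`[^\x00-\x7F]+`)

// hasNonASCII (hypothetical name) reports whether s contains at least
// one non-ASCII character.
func hasNonASCII(s string) bool {
	return nonASCII.MatchString(s)
}

func main() {
	inputs := []string{
		"Hello, café!",
		"naïve résumé 🎉",
		"plain ascii here",
	}
	for _, in := range inputs {
		if !hasNonASCII(in) {
			fmt.Printf("%q: no match\n", in)
			continue
		}
		// List every run of non-ASCII characters found in the input.
		fmt.Printf("%q: %q\n", in, nonASCII.FindAllString(in, -1))
	}
}
```

When only a boolean answer is needed, `MatchString` avoids building the slice of matches.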
