How to Remove Invalid UTF-8 Characters from a String in Go


Using ToValidUTF8 Function

The strings.ToValidUTF8() function (added in Go 1.13) returns a copy of the string s with each run of invalid UTF-8 byte sequences replaced by the replacement string, which may be empty.

The following example passes an empty replacement string, so the invalid byte \xc5 is simply removed:

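package main

import (
  "fmt"
  "strings"
)

func main() {
  // \xc5 starts a two-byte sequence but is followed by 'b', so it is invalid.
  s := "a\xc5bd"

  // An empty replacement drops the invalid bytes.
  s = strings.ToValidUTF8(s, "")
  fmt.Printf("%q\n", s) // prints "abd"
}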
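The phrase "each run" matters when the replacement is not empty: consecutive invalid bytes collapse into a single replacement. The sketch below illustrates this with "?" as a marker; markInvalid is a hypothetical wrapper added only for the illustration, not part of the standard library.

```go
package main

import (
	"fmt"
	"strings"
)

// markInvalid is a hypothetical helper for this illustration. It replaces
// each run of invalid UTF-8 bytes in s with one copy of marker.
func markInvalid(s, marker string) string {
	return strings.ToValidUTF8(s, marker)
}

func main() {
	// Two adjacent invalid bytes form one run, so only one marker appears.
	fmt.Printf("%q\n", markInvalid("a\xc5\xc5bd", "?")) // prints "a?bd"

	// Two separate runs each get their own marker.
	fmt.Printf("%q\n", markInvalid("a\xc5b\x8ad", "?")) // prints "a?b?d"
}
```

A visible marker like this can be useful when you want to see where data was lost instead of silently dropping it.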

Using Map function

In Go 1.11+, you can get the same result with strings.Map and utf8.RuneError. A mapping function that returns a negative value drops that rune from the result:

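package main

import (
  "fmt"
  "strings"
  "unicode/utf8"
)

func main() {
  s := "a\xc5bd"

  // valid drops any rune that decodes to utf8.RuneError (U+FFFD).
  valid := func(r rune) rune {
    if r == utf8.RuneError {
      return -1 // a negative value removes the rune
    }
    return r
  }
  s = strings.Map(valid, s)
  fmt.Printf("%q\n", s) // prints "abd"
}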

Using Range

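You can also iterate over the string with range and skip the invalid bytes yourself. An invalid byte decodes to utf8.RuneError with a size of 1, while a correctly encoded U+FFFD has a size of 3, so checking the size keeps legitimate replacement characters. For example,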

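package main

import (
  "fmt"
  "unicode/utf8"
)

// ToValid returns s with every invalid UTF-8 byte removed.
func ToValid(s string) string {
  // Fast path: nothing to do for strings that are already valid.
  if utf8.ValidString(s) {
    return s
  }

  v := make([]rune, 0, len(s))
  for i, r := range s {
    if r == utf8.RuneError {
      // Decode again at this position: size 1 means an invalid byte,
      // while a correctly encoded U+FFFD has size 3 and is kept.
      _, size := utf8.DecodeRuneInString(s[i:])
      if size == 1 {
        continue
      }
    }
    v = append(v, r)
  }
  return string(v)
}

func main() {
  s := "a\xc5b\x8ad"

  s = ToValid(s)
  fmt.Printf("%q\n", s) // prints "abd"
}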
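To see why the loop above checks the decoded size, it can help to look at utf8.DecodeRuneInString in isolation. The following is a minimal sketch; isInvalidAt is a hypothetical helper written for this illustration, and it assumes i is a valid index into s.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isInvalidAt is a hypothetical helper for this illustration. It reports
// whether the byte at index i of s is an invalid UTF-8 byte, which
// DecodeRuneInString signals by returning (utf8.RuneError, 1).
func isInvalidAt(s string, i int) bool {
	r, size := utf8.DecodeRuneInString(s[i:])
	return r == utf8.RuneError && size == 1
}

func main() {
	// Byte 0 is invalid; bytes 1 to 3 are a valid encoding of U+FFFD.
	s := "\xc5\uFFFD"

	fmt.Println(isInvalidAt(s, 0)) // prints true
	fmt.Println(isInvalidAt(s, 1)) // prints false
}
```

Both positions decode to utf8.RuneError, so comparing the rune alone cannot tell them apart; only the size distinguishes an invalid byte from a real U+FFFD.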
