The post Understanding Rune in Golang appeared first on Welcome To Golang By Example.
rune in Go is an alias for int32, meaning it is an integer value. This integer value is meant to represent a Unicode code point. To understand rune you have to know what Unicode is. Below is a short description, but you can refer to the famous blog post about it –
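Unicode is a superset of ASCII that assigns a unique number to every character. This unique number is called a Unicode code point. For example, the digit 0 is U+0030 (decimal 48), the lowercase letter b is U+0062 (decimal 98), and the pound sign £ is U+00A3 (decimal 163).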
Visit https://en.wikipedia.org/wiki/List_of_Unicode_characters to look up the code points of other characters. Unicode itself does not say how these code points are stored in memory. This is where utf-8 comes into the picture.
utf-8 stores every code point using 1, 2, 3, or 4 bytes; ASCII code points take 1 byte. The largest code point, U+10FFFF, needs 21 bits, so a 32-bit integer can hold any code point, which is why rune is an alias for int32. Go source code is encoded in utf-8, so every string literal is also utf-8 encoded.
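The 1-to-4-byte rule can be checked with the standard unicode/utf8 package. The following is a minimal sketch; encodedLen is a hypothetical helper name introduced here for illustration, and the sample characters are arbitrary.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedLen is a hypothetical helper that reports how many bytes
// utf-8 needs to encode r. It simply wraps utf8.RuneLen.
func encodedLen(r rune) int {
	return utf8.RuneLen(r)
}

func main() {
	// One sample character for each possible encoded length.
	for _, r := range []rune{'a', '£', '€', '😀'} {
		fmt.Printf("%c %U needs %d byte(s)\n", r, r, encodedLen(r))
	}
	// utf8.MaxRune is the largest valid code point.
	fmt.Printf("Largest code point: %U\n", utf8.MaxRune)
}
```

Every rune occupies 4 bytes in memory as an int32, regardless of how many bytes its utf-8 encoding takes inside a string.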
Every rune is intended to refer to one Unicode code point. For example, if you convert a string to a rune slice and print it with %U, you get the code point of each character. For the string "0b£" the output will be [U+0030 U+0062 U+00A3]
fmt.Printf("%U\n", []rune("0b£"))
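Because a string is stored as utf-8 bytes, its byte length and its number of runes can differ. The sketch below illustrates this for the same "0b£" string; byteAndRuneCount is a hypothetical helper name, not something from the original post.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount is a hypothetical helper that returns the number of
// bytes in s and the number of Unicode code points in s.
func byteAndRuneCount(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "0b£"
	nBytes, nRunes := byteAndRuneCount(s)
	fmt.Printf("bytes: %d, runes: %d\n", nBytes, nRunes)

	// Ranging over a string decodes one rune per iteration;
	// i is the byte offset at which that rune starts.
	for i, r := range s {
		fmt.Printf("byte offset %d: %U %c\n", i, r, r)
	}
}
```

Here '£' takes two bytes, so the string has 4 bytes but only 3 runes.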
Declare Rune
A rune is declared by putting a character between single quotes. The line below declares a variable named rPound:
rPound := '£'
After declaring a rune you can also print its type, its Unicode code point, and its character:
fmt.Printf("Type: %s\n", reflect.TypeOf(rPound))
fmt.Printf("Unicode CodePoint: %U\n", rPound)
fmt.Printf("Character: %c\n", rPound)
Below is the code illustrating each point we discussed
package main
import (
"fmt"
"reflect"
"unsafe"
)
func main() {
r := 'a'
//Print Size
fmt.Printf("Size: %d\n", unsafe.Sizeof(r))
//Print Type
fmt.Printf("Type: %s\n", reflect.TypeOf(r))
//Print Code Point
fmt.Printf("Unicode CodePoint: %U\n", r)
//Print Character
fmt.Printf("Character: %c\n", r)
s := "0b£"
//This will print the Unicode Points
fmt.Printf("%U\n", []rune(s))
//This will print the decimal value of each Unicode Code Point
fmt.Println([]rune(s))
}
Output:
Size: 4
Type: int32
Unicode CodePoint: U+0061
Character: a
[U+0030 U+0062 U+00A3]
[48 98 163]
A rune slice can be converted to a string with a string() conversion:
package main
import "fmt"
func main() {
runeArray := []rune{'a', 'b', '£'}
s := string(runeArray)
fmt.Println(s)
}
Output:
ab£
A string can be converted to a rune slice with a []rune() conversion:
package main
import "fmt"
func main() {
s := "ab£"
r := []rune(s)
fmt.Printf("%U\n", r)
}
Output:
[U+0061 U+0062 U+00A3]
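One practical reason to convert a string to a rune slice is indexing by character rather than by byte. The sketch below contrasts the two; runeAt is a hypothetical helper name, and it assumes the index is within range.

```go
package main

import "fmt"

// runeAt is a hypothetical helper that returns the i-th character of s
// by first converting s to a rune slice. It panics if i is out of range.
func runeAt(s string, i int) rune {
	return []rune(s)[i]
}

func main() {
	s := "ab£"

	// Indexing the string directly yields a single byte, here the
	// first byte of the two-byte utf-8 encoding of '£'.
	fmt.Printf("s[2] = %#x\n", s[2])

	// Indexing the rune slice yields the whole code point.
	r := runeAt(s, 2)
	fmt.Printf("runeAt(s, 2) = %c (%U)\n", r, r)
}
```

The conversion copies the string, so for long strings that are scanned once, a range loop is usually the cheaper option.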
The post Character in Go (Golang) appeared first on Welcome To Golang By Example.
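Go does not have a 'char' data type. Therefore, a character is represented with a byte (for ASCII characters), a rune (for any Unicode code point), or a string (for a sequence of characters).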
To declare either a byte or a rune we use single quotes. When declaring a byte we have to specify the type; if we don't, the literal defaults to rune.
To declare a string, we use double quotes or backquotes. A double-quoted string honors escape sequences, while a backquoted string is a raw string literal and performs no escaping at all.
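The difference between the two string literal forms shows up in their lengths. The following is a small sketch; literalLengths is a hypothetical helper name used only for this illustration.

```go
package main

import "fmt"

// literalLengths is a hypothetical helper that returns the byte length
// of an interpreted (double-quoted) literal and of a raw (backquoted)
// literal that are spelled the same way in source.
func literalLengths() (int, int) {
	interpreted := "a\tb" // \t becomes one tab character
	raw := `a\tb`         // backslash and t are kept as two characters
	return len(interpreted), len(raw)
}

func main() {
	i, r := literalLengths()
	fmt.Println("interpreted length:", i)
	fmt.Println("raw length:", r)

	// The escape is honored only in the double-quoted form.
	fmt.Println("line1\nline2")
	fmt.Println(`line1\nline2`)
}
```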
See the program below. It declares a byte, a rune, and a string, and prints the size, type, and value of each.
package main
import (
"fmt"
"reflect"
"unsafe"
)
func main() {
//If you don't specify the type here, the literal defaults to rune
var b byte = 'a'
fmt.Println("Printing Byte:")
//Print Size, Type and Character
fmt.Printf("Size: %d\nType: %s\nCharacter: %c\n", unsafe.Sizeof(b), reflect.TypeOf(b), b)
r := '£'
fmt.Println("\nPrinting Rune:")
//Print Size, Type, CodePoint and Character
fmt.Printf("Size: %d\nType: %s\nUnicode CodePoint: %U\nCharacter: %c\n", unsafe.Sizeof(r), reflect.TypeOf(r), r, r)
s := "µ" //Micro sign
fmt.Println("\nPrinting String:")
//Print Size, Type and Character
fmt.Printf("Size: %d\nType: %s\nCharacter: %s\n", unsafe.Sizeof(s), reflect.TypeOf(s), s)
}
Output:
Printing Byte:
Size: 1
Type: uint8
Character: a
Printing Rune:
Size: 4
Type: int32
Unicode CodePoint: U+00A3
Character: £
Printing String:
Size: 16
Type: string
Character: µ
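A byte can hold only values from 0 to 255. Assigning it a character constant with a larger code point, such as 285, raises a compile-time error (the exact wording depends on the Go version):
constant 285 overflows byte
Likewise, putting more than one character between single quotes raises a compile-time error:
invalid character literal (more than one character)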
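The overflow error above is raised for constants at compile time. For a rune value known only at run time, the same range check has to be done by hand. The sketch below shows one way to do it; toByte is a hypothetical helper name, not a standard library function.

```go
package main

import (
	"fmt"
	"math"
)

// toByte is a hypothetical helper that converts r to a byte only when
// its value fits in the range 0 to 255. The second result reports
// whether the conversion was possible.
func toByte(r rune) (byte, bool) {
	if r < 0 || r > math.MaxUint8 {
		return 0, false
	}
	return byte(r), true
}

func main() {
	// 'a' is code point 97 and fits; code point 285 does not.
	for _, r := range []rune{'a', rune(285)} {
		if b, ok := toByte(r); ok {
			fmt.Printf("%U fits in a byte: %d\n", r, b)
		} else {
			fmt.Printf("%U does not fit in a byte\n", r)
		}
	}
}
```

A plain byte(r) conversion on a variable compiles without error and silently keeps only the low 8 bits, which is why the explicit check is useful.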