http.DetectContentType do not return gbk charset #27461

Closed
sanguohot opened this issue Sep 3, 2018 · 2 comments

@sanguohot

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.11 windows/amd64.

Does this issue reproduce with the latest release?

Yes, it does.

What operating system and processor architecture are you using (go env)?

Windows 7, amd64.

What did you do?

If possible, provide a recipe for reproducing the error.
This is my test code:

package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"strings"

	// Canonical import path for the go-ethereum hexutil package.
	"github.com/ethereum/go-ethereum/common/hexutil"
	"golang.org/x/text/encoding/simplifiedchinese"
	"golang.org/x/text/transform"
)

// DecodeToGBK converts a UTF-8 string to GBK bytes (returned as a string).
// Despite its name, it uses the GBK encoder. It is not called in main.
func DecodeToGBK(utf8Str string) (string, error) {
	r := transform.NewReader(strings.NewReader(utf8Str), simplifiedchinese.GBK.NewEncoder())
	bytes, err := ioutil.ReadAll(r)
	if err != nil {
		return "", err
	}
	return string(bytes), nil
}

// EncodeFromGBK converts GBK bytes (held in a string) to a UTF-8 string.
// Despite its name, it uses the GBK decoder.
func EncodeFromGBK(gbkStr string) (string, error) {
	r := transform.NewReader(strings.NewReader(gbkStr), simplifiedchinese.GBK.NewDecoder())
	bytes, err := ioutil.ReadAll(r)
	if err != nil {
		return "", err
	}
	return string(bytes), nil
}

func main() {
	// "hello world!\r\n" followed by GBK-encoded Chinese text and ASCII digits.
	bytes, err := hexutil.Decode("0x68656c6c6f20776f726c64210d0aced2cac7bae3dec8313131313232323232")
	if err != nil {
		log.Fatal(err)
	}
	gbkStr := string(bytes)
	fmt.Println("http.DetectContentType ===>", http.DetectContentType(bytes))
	fmt.Println("gbkStr ===>", gbkStr)
	gbkToUtf8Str, err := EncodeFromGBK(gbkStr)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("gbkToUtf8Str ===>", gbkToUtf8Str)
}

What did you expect to see?

I expected to see 'text/plain; charset=gbk', so that the browser could show me the right text.
But sadly that was not the case.

What did you see instead?

The output was:

http.DetectContentType ===> text/plain; charset=utf-8
gbkStr ===> hello world!
���Ǻ���111122222
gbkToUtf8Str ===> hello world!
我是恒奕111122222
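
The replacement characters in the gbkStr line above appear because the GBK bytes are not well-formed UTF-8. A minimal sketch of checking that with the standard library follows; the helper name looksLikeUTF8 is hypothetical, and the byte literal is assumed to match the hex string in the test code.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// sample holds the bytes from the report: ASCII text, CRLF,
// eight GBK-encoded bytes, then ASCII digits.
var sample = []byte("hello world!\r\n\xce\xd2\xca\xc7\xba\xe3\xde\xc8111122222")

// looksLikeUTF8 is a hypothetical helper that reports whether data
// is well-formed UTF-8. It cannot tell which legacy encoding was used.
func looksLikeUTF8(data []byte) bool {
	return utf8.Valid(data)
}

func main() {
	fmt.Println(looksLikeUTF8(sample))                 // false
	fmt.Println(looksLikeUTF8([]byte("hello world!"))) // true
}
```

A failed check only says the data is not UTF-8. It does not identify the data as GBK, so the caller still has to know the real encoding.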

@agnivade
Contributor

agnivade commented Sep 3, 2018

https://golang.org/pkg/net/http/#DetectContentType

DetectContentType implements the algorithm described at https://mimesniff.spec.whatwg.org/ to determine the Content-Type of the given data.

We just implement the algorithm described there. The mimesniff algorithm (https://mimesniff.spec.whatwg.org/#identifying-a-resource-with-an-unknown-mime-type) only seems to return utf-8 and utf-16 charsets. I am going to mark this as working as intended. Please raise an issue at https://github.com/whatwg/mimesniff/ so that we are able to incorporate it.

@sanguohot
Author

@agnivade Hey, I raised an issue at whatwg/mimesniff#77

@golang golang locked and limited conversation to collaborators Sep 7, 2019