yiqingzhang commented Dec 15, 2016

I found that supplying a short text containing non-ASCII characters (e.g., "Arès Méroueh") produces an error in the 'originalText' field of the results. The reason is that the requests library guesses the response's encoding. For "Arès Méroueh" it picks ISO-8859-1, which is wrong: according to Stanford CoreNLP, the default response encoding is UTF-8. Adding the line (r.encoding = 'utf-8') eliminates errors like this.
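To illustrate the symptom, here is a minimal standard-library sketch of what happens when UTF-8 bytes are decoded as ISO-8859-1. It does not call the server or the requests library; the variable names are only for illustration, and the byte string stands in for what the server would send under the assumption that it replies in UTF-8.

```python
# The text sent to the server, and the bytes a UTF-8 server would send back.
name = "Arès Méroueh"
raw = name.encode("utf-8")

# Decoding with the wrongly guessed encoding corrupts each accented letter,
# because every two-byte UTF-8 sequence is read as two separate characters.
wrong = raw.decode("iso-8859-1")

# Decoding with the actual encoding restores the original text.
right = raw.decode("utf-8")

print(wrong)  # ArÃ¨s MÃ©roueh
print(right)  # Arès Méroueh
```

Setting r.encoding = 'utf-8' before reading r.text has the same effect as the second decode: requests then uses UTF-8 instead of its guess.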

To make the code more robust, the interface could be changed to let the user specify the response encoding.
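One possible shape for such an interface is sketched below. Both `decode_response` and `FakeResponse` are hypothetical names that are not part of this library or of requests. `FakeResponse` only imitates the `content`, `encoding`, and `text` attributes of a requests response so that the sketch runs without a server.

```python
class FakeResponse:
    """Hypothetical stand-in for a requests response (illustration only)."""

    def __init__(self, content, encoding):
        self.content = content    # raw bytes of the body
        self.encoding = encoding  # encoding guessed by the HTTP client

    @property
    def text(self):
        # Like requests, decode the body with the current encoding.
        return self.content.decode(self.encoding, errors="replace")


def decode_response(response, encoding="utf-8"):
    """Return the body text, overriding the guessed encoding if one is given.

    Pass encoding=None to keep whatever the client guessed.
    """
    if encoding is not None:
        response.encoding = encoding
    return response.text


# A UTF-8 body for which the client guessed ISO-8859-1.
body = "Arès Méroueh".encode("utf-8")

fixed = decode_response(FakeResponse(body, "ISO-8859-1"))
unfixed = decode_response(FakeResponse(body, "ISO-8859-1"), encoding=None)

print(fixed)    # Arès Méroueh
print(unfixed)  # ArÃ¨s MÃ©roueh
```

Defaulting the parameter to 'utf-8' keeps the fix from this pull request as the normal behavior, while still letting a caller choose another encoding if their server is configured differently.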
