
Tracking issue for RFC 2457, "Allow non-ASCII identifiers" #55467

Closed
Description

@Centril
Contributor

This is a tracking issue for the RFC "Allow non-ASCII identifiers" (rust-lang/rfcs#2457).

Steps:

Unresolved questions:

  • Which context is adequate for confusable detection: file, current scope, crate?
  • How are non-ASCII idents best supported in debuggers?
      Resolved: DWARF and debuggers handle UTF-8 just fine.
  • Which name mangling scheme is used by the compiler? (Punycode, see RFC 2603)
  • Is there a better name for the less_used_codepoints lint?
      Resolved in favor of uncommon_codepoints.
  • Which lint should the global mixed scripts confusables detection trigger?
      Resolved in favor of mixed_script_confusables.
  • How badly do non-ASCII idents exacerbate const pattern confusion (#7526, "Statics shadow local variables causing "refutable pattern error", and non-obvious bugs"; #49680, "We shouldn't even try to resolve irrefutable patterns as constants")? Can we improve precision of linting here?
  • In mixed_script_confusables, do we actually need to make an exception for Latin identifiers?
  • Terminal width is tricky with Unicode. Some characters are wide, some have widths that depend on the fonts installed (e.g. emoji sequences), and modifiers are a thing. The concept of a monospace font doesn't generalize as well to other scripts. How does rustfmt deal with this when determining line width?
  • Right-to-left scripts can lead to weird rendering in mixed contexts (depending on the software used), especially when mixed with operators. This is not something that should block stabilization; however, we feel it is important to call it out explicitly. Future RFCs (preferably put forth by RTL-using communities) may attempt to improve this situation (e.g. by allowing bidi control characters in specific contexts).
  • Tweak XID_Start / XID_Continue? See #4928, "XID_Start / XID_Continue might not be quite right":

      http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1518.htm

      The ISO JTC1/SC22/WG14 (C language) think that possibly UTR#31 didn't quite hit the nail on the head in terms of defining identifier syntax. They have a couple of tweaks in mind. Consider following their lead.
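
To make the confusable-detection questions above concrete, here is a minimal sketch, assuming a toolchain where non-ASCII identifiers are accepted without a feature gate. It declares two Latin-script identifiers with non-ASCII letters and then shows that Latin "a" and Cyrillic "а" are different code points even though they usually render alike. The helper name `code_points` is hypothetical and exists only for this illustration; it is not part of the RFC or of rustc.

```rust
/// Hypothetical helper: formats each character of `ident` as a "U+XXXX" label.
fn code_points(ident: &str) -> Vec<String> {
    ident.chars().map(|c| format!("U+{:04X}", c as u32)).collect()
}

fn main() {
    // Non-ASCII identifiers of the kind RFC 2457 allows (both are Latin script).
    let größe = 3;
    let fläche = größe * größe;
    println!("{}", fläche); // prints 9

    // Latin "a" (U+0061) and Cyrillic "а" (U+0430) look alike in most fonts,
    // but they are distinct code points, so the strings compare unequal.
    // The Cyrillic letter is written as an escape to keep the source unambiguous.
    let latin = "a";
    let cyrillic = "\u{0430}";
    assert_ne!(latin, cyrillic);
    println!("{:?} vs {:?}", code_points(latin), code_points(cyrillic));
}
```

Two identifiers that differ only in such a pair would be distinct bindings to the compiler while looking identical to a reader, which is the situation the confusables lints are meant to flag.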


Zulip topic for real-time discussion:
https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/nonascii.20identifiers(rfc.202457)
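
On the line-width question raised in the list of unresolved questions, the following sketch shows why neither of the two lengths the standard library offers is a display width. It says nothing about what rustfmt actually does; the helper name `byte_and_char_len` is hypothetical, and the remark about column width is an assumption about typical terminal rendering.

```rust
/// Hypothetical helper: returns (UTF-8 byte length, number of `char`s) for `s`.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    let idents = ["size", "größe", "面积"];
    for ident in idents.iter() {
        let (bytes, chars) = byte_and_char_len(ident);
        println!("{}: {} bytes, {} chars", ident, bytes, chars);
    }
    // "size":  4 bytes, 4 chars
    // "größe": 7 bytes, 5 chars
    // "面积":  6 bytes, 2 chars (terminals commonly draw each ideograph
    //          two columns wide, so neither number matches the visible width)
}
```

A tool that measures lines in bytes or in `char`s will therefore disagree with what the reader sees as soon as wide or combining characters appear in identifiers.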

Activity

Manishearth (Member) commented on Oct 29, 2018

The last unresolved question isn't a real unresolved question; it was included in the RFC for completeness but does not block this issue.

Centril (Contributor, Author) commented on Oct 29, 2018

@joshtriplett Please check that the list of checkboxes above is satisfactory. :)

@Manishearth alright; leave a note under it to that effect?

Manishearth (Member) commented on Oct 29, 2018

The note saying so is already in the unresolved question.

8573 commented on Oct 29, 2018

Is there a better name for the less_used_codepoints lint?

Substituting "rare" or "unusual" for "less used" seems to me a simple, if not necessarily final, improvement, replacing the somewhat awkward "less used" with a single, shorter, more usual synonym.

(Edit: I note that I personally oppose allowing non-ASCII identifiers, but I recognize that the Rust Team favors it, and I have no problem bowing to their decision and chipping in my cents to help.)

Manishearth (Member) commented on Oct 29, 2018

Serentty (Contributor) commented on Oct 29, 2018

I would prefer “rare” as it sounds more objective to me than “unusual”, and perhaps less judgemental as well.

eaglgenes101 commented on Nov 1, 2018

My first thought was "uncommon", but that's not strong enough of an adjective to get the intended meaning across.

Centril (Contributor, Author) commented on Nov 1, 2018

I'm partial towards "rare" as well; rare_codepoints is pretty short and sweet.

Labels

  • B-RFC-approved: Blocker: Approved by a merged RFC but not yet implemented.
  • B-RFC-implemented: Blocker: Approved by a merged RFC and implemented but not stabilized.
  • B-unstable: Blocker: Implemented in the nightly compiler and unstable.
  • C-tracking-issue: Category: An issue tracking the progress of sth. like the implementation of an RFC
  • F-non_ascii_idents: `#![feature(non_ascii_idents)]`
  • T-lang: Relevant to the language team
  • disposition-merge: This issue / PR is in PFCP or FCP with a disposition to merge it.
  • finished-final-comment-period: The final comment period is finished for this PR / Issue.


Participants

@ehuss @comex @fitzgen @eddyb @nikomatsakis
