GSoC 2025 - Using AI for Unicode Spoof Detection #6212

RaidedCluster · 2025-02-28T09:32:08Z

Hello everyone!

I applied to GSoC 2024 for this same project idea.

I'm a sophomore, and I've also been doing some AI safety work at Anthropic and the UK AISI.

The last time I applied, I was working on a paper that illustrated how confusables and non-standard Unicode characters impact LLMs.

The paper's done now, but the hunt for confusables is still on.
https://arxiv.org/abs/2405.14490

This was my proposal from last year:
https://drive.google.com/file/d/1pPbhmrMwegTn5kNJUzC0rRulUzAfis3X/view?usp=sharing

The past year has given me time to appreciate how vast Unicode is, and I've come across clusters of confusables that I had glossed over because they rendered as tofu 𓊪 𓊪 𓊪 (oops, spilled some).

Looking back, the approach in last year's proposal had a lot of drawbacks.

I'll address them in my new proposal.
