Shawna C Scott edited this page Jul 27, 2014 · 2 revisions

What kind of duplication will we need to handle? Thoughts on dealing with event duplication.

**Labels:** Phase-Requirements, duplication

What kinds of event duplication will we need to handle?

  • Giving people tools to sort out the incoming pieces is important
  • Being able to set a canonical event to cluster duplicates around
  • Being able to delete pure duplicate content or mistakes
      • Having the ability to track what has been deleted in case of mistakes, via some sort of versioning
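To make the list above concrete, here is a minimal sketch of how a canonical event, its duplicates, and reversible deletion could fit together. Everything in it is an assumption for illustration: the names `Event`, `EventStore`, `canonical_id`, and `history` are hypothetical and do not come from any existing code for this project.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Event:
    event_id: str
    title: str
    canonical_id: Optional[str] = None  # set when this event duplicates another
    deleted: bool = False               # soft delete, so mistakes can be undone


@dataclass
class EventStore:
    events: Dict[str, Event] = field(default_factory=dict)
    # Each history entry records (action, event_id, previous value).
    history: List[Tuple[str, str, object]] = field(default_factory=list)

    def add(self, event: Event) -> None:
        self.events[event.event_id] = event

    def mark_duplicate(self, event_id: str, canonical_id: str) -> None:
        """Attach an event to the cluster around a canonical event."""
        event = self.events[event_id]
        self.history.append(("mark_duplicate", event_id, event.canonical_id))
        event.canonical_id = canonical_id

    def delete(self, event_id: str) -> None:
        """Flag an event as deleted without discarding it."""
        event = self.events[event_id]
        self.history.append(("delete", event_id, event.deleted))
        event.deleted = True

    def undo(self) -> None:
        """Revert the most recent change using the recorded previous value."""
        action, event_id, previous = self.history.pop()
        event = self.events[event_id]
        if action == "delete":
            event.deleted = previous
        else:
            event.canonical_id = previous

    def cluster(self, canonical_id: str) -> List[str]:
        """List the live duplicates grouped around one canonical event."""
        return [
            e.event_id
            for e in self.events.values()
            if e.canonical_id == canonical_id and not e.deleted
        ]


store = EventStore()
store.add(Event("a", "Jazz Night"))
store.add(Event("b", "Jazz Night at the Blue Room"))
store.add(Event("c", "Farmers Market"))

store.mark_duplicate("b", "a")  # "a" becomes the canonical event for "b"
store.delete("c")               # deleted by mistake
store.undo()                    # "c" is restored
```

Keeping deletions as flags plus a history list is only one way to get "some sort of versioning"; a full revision table per event would serve the same purpose at larger scale.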

Thoughts on handling event duplication

Bloom filters (http://www.rubyinside.com/bloom-filters-a-powerful-tool-599.html) can be used to help dedupe: effectively, you can make 'fingerprints' of things. -Anselm Hook
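As a rough illustration of the fingerprint idea, the sketch below normalizes a few event fields into a string and records it in a small Bloom filter. The choice of fields (title, start time, venue), the normalization, and the names `BloomFilter` and `event_fingerprint` are all assumptions made for this example, not a decided design. Note that a Bloom filter can report false positives, so a hit should be treated as "possible duplicate, check further" rather than as proof.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter backed by a bytearray (illustrative sizes only)."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        # True means "possibly seen"; False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def event_fingerprint(title, start, venue):
    """Build a fingerprint string (hypothetical field choice)."""
    # Lowercase and collapse whitespace so trivial differences do not matter.
    parts = [" ".join(part.lower().split()) for part in (title, start, venue)]
    return "|".join(parts)


seen = BloomFilter()
first = event_fingerprint("Jazz Night", "2014-08-01T20:00", "Blue Room")
repeat = event_fingerprint("  jazz   night ", "2014-08-01T20:00", "BLUE ROOM")

seen.add(first)
is_possible_duplicate = repeat in seen
```

This only catches duplicates whose normalized fields match exactly; events that differ in wording would still need fuzzier matching or human review.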

Perhaps the problem can be somewhat ameliorated by not scraping events from calendars that are merely secondary sources, that is, calendars that contain only (or almost only) events that appear elsewhere on the web.

INITIAL REVIEW NEEDED