This interdisciplinary survey aims to serve as a comprehensive resource for researchers and practitioners working at the intersection of NLP, multimodal AI, and patent analysis, and for patent offices seeking to build efficient patent systems.
USPTO-2M - DeepPatent: Patent classification with convolutional neural networks and word embedding
BIGPATENT - BIGPATENT: A Large-Scale Dataset for Abstractive and Coherent Summarization
USPTO-3M - Patent Classification by Fine-Tuning BERT Language Model
PatentMatch - PatentMatch: A dataset for matching patent claims & prior art
DeepPatent - DeepPatent: Large scale patent drawing recognition and retrieval
DeepPatent2 - DeepPatent2: A Large-Scale Benchmarking Corpus for Technical Drawing Understanding
HUPD - The Harvard USPTO Patent Dataset: A Large-Scale, Well-Structured, and Multi-Purpose Corpus of Patent Applications
IMPACT - IMPACT: A Large-scale Integrated Multimodal Patent Analysis and Creation Dataset for Design Patents