Gábor Recski edited this page Feb 25, 2016 · 8 revisions

A project for developing tools that measure various types of word similarity and for evaluating them on various datasets.

Subprojects / Division of labour (tentative)

0/a Acquire existing datasets and embeddings

0/b Build OpenRoget dataset

  1. Build small ML framework for measuring word similarity / synonym detection
  2. Measure all similarities on all datasets
  3. Develop 4lang similarity
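As a rough illustration of subprojects 1 and 2, the sketch below scores word pairs with a similarity function and compares the scores with gold ratings using Spearman rank correlation. It is a minimal sketch only: the choice of cosine similarity and Spearman correlation, the `(word1, word2, gold_score)` dataset layout, the function names, and the toy data are all assumptions made for illustration, not decisions recorded on this page.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation (0.0 if either sequence has no variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def evaluate(similarity, embeddings, dataset):
    """Correlate predicted similarities with gold scores.

    dataset is a list of (word1, word2, gold_score) triples (assumed layout);
    pairs with a word missing from embeddings are skipped.
    """
    gold, predicted = [], []
    for w1, w2, score in dataset:
        if w1 in embeddings and w2 in embeddings:
            gold.append(score)
            predicted.append(similarity(embeddings[w1], embeddings[w2]))
    return spearman(gold, predicted)


# Toy data invented for illustration; not taken from any real dataset.
toy_embeddings = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "car": [0.0, 1.0],
    "bus": [0.3, 0.9],
}
toy_dataset = [("cat", "dog", 9.0), ("car", "bus", 8.0), ("cat", "car", 1.0)]

if __name__ == "__main__":
    print(evaluate(cosine, toy_embeddings, toy_dataset))
```

Passing the similarity function as an argument means any other measure, such as a 4lang-based one, could be plugged into `evaluate` without changing the evaluation loop.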

Technical

  • Progress should be documented on this wiki
  • Discussions should take place on the mailing list [email protected]
  • Datasets should be stored under nessi6:/mnt/store/home/hlt/wordsim

Meetings, milestones

  • Intro to 4lang similarity: February 3rd, 10.30 am, SZTAKI Rm 506
  • Deadlines: ACL short paper, February 29th; ACL long paper, March 18th; StarSem, April 18th

Bibliography
