datasci

Note: The basic framework for some of the scripts was taken from external online sources such as Coursera and StackOverflow.

twitterStream/

  • generating Twitter streams
  • sentiment analysis
  • filtering by region
  • top ten hashtags
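
As a rough illustration of the hashtag count, here is a minimal sketch rather than the script in this folder. It assumes each tweet is a parsed JSON dict with hashtags under `entities.hashtags[].text`; the function name `top_hashtags` and the sample data are hypothetical.

```python
from collections import Counter


def top_hashtags(tweets, n=10):
    """Return the n most frequent hashtags as (tag, count) pairs."""
    counts = Counter()
    for tweet in tweets:
        # Assumed layout: tweet["entities"]["hashtags"] is a list of {"text": ...}.
        entities = tweet.get("entities") or {}
        for tag in entities.get("hashtags", []):
            # Lower-case so "#Data" and "#data" are counted together.
            counts[tag["text"].lower()] += 1
    return counts.most_common(n)


# Hypothetical sample input; real input would come from the stream output.
sample = [
    {"entities": {"hashtags": [{"text": "Data"}, {"text": "python"}]}},
    {"entities": {"hashtags": [{"text": "data"}]}},
    {"text": "a tweet without entities"},
]
print(top_hashtags(sample))  # [('data', 2), ('python', 1)]
```

Tweets without an `entities` field are skipped instead of raising an error, which matters for streams that also contain delete notices and other non-tweet messages.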

mapReduce/

  • simple implementations that break a problem into key-value pairs and use MapReduce to group the values corresponding to each key
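
The key-value idea can be sketched in plain Python as follows. This is an in-memory illustration, not the framework used by the scripts here: `run_mapreduce`, `mapper`, and `reducer` are hypothetical names, and the inverted-index task is just one example problem.

```python
from collections import defaultdict


def mapper(record):
    """Emit (word, doc_id) pairs for every word in a document."""
    doc_id, text = record
    for word in text.lower().split():
        yield word, doc_id


def reducer(key, values):
    """Collapse all doc ids seen for one word into a sorted, unique list."""
    return key, sorted(set(values))


def run_mapreduce(records, mapper, reducer):
    """Map every record, group emitted values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # Sort keys only to make the output order deterministic.
    return [reducer(key, values) for key, values in sorted(groups.items())]


docs = [("d1", "big data tools"), ("d2", "big ideas")]
result = run_mapreduce(docs, mapper, reducer)
print(result)
# [('big', ['d1', 'd2']), ('data', ['d1']), ('ideas', ['d2']), ('tools', ['d1'])]
```

The grouping step in the middle is what a real MapReduce runtime performs as the shuffle phase; only the mapper and reducer change from one problem to the next.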

pig/

  • scripts to filter the "billion triple dataset" according to different fields

sql/

  • implementation of basic SQL operations
  • sparse matrix
  • similarity matrix
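
One way to picture the sparse and similarity matrix items is the sketch below, which runs SQL through Python's built-in `sqlite3` module. It assumes matrices are stored as `(row_num, col_num, value)` triples with zero entries omitted; the table names `a` and `b` and the sample values are hypothetical and not taken from this repository.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Sparse storage: one row per non-zero cell.
    CREATE TABLE a (row_num INTEGER, col_num INTEGER, value REAL);
    CREATE TABLE b (row_num INTEGER, col_num INTEGER, value REAL);
    INSERT INTO a VALUES (0, 0, 1), (0, 1, 2), (1, 1, 3);
    INSERT INTO b VALUES (0, 0, 4), (1, 0, 5), (1, 1, 6);
""")

# Matrix product A x B: match A's columns to B's rows, then sum the products.
product = conn.execute("""
    SELECT a.row_num, b.col_num, SUM(a.value * b.value)
    FROM a JOIN b ON a.col_num = b.row_num
    GROUP BY a.row_num, b.col_num
    ORDER BY a.row_num, b.col_num
""").fetchall()

# Similarity matrix A x transpose(A): dot product of every pair of rows of A.
similarity = conn.execute("""
    SELECT x.row_num, y.row_num, SUM(x.value * y.value)
    FROM a AS x JOIN a AS y ON x.col_num = y.col_num
    GROUP BY x.row_num, y.row_num
    ORDER BY x.row_num, y.row_num
""").fetchall()

print(product)     # [(0, 0, 14.0), (0, 1, 12.0), (1, 0, 15.0), (1, 1, 18.0)]
print(similarity)  # [(0, 0, 5.0), (0, 1, 6.0), (1, 0, 6.0), (1, 1, 9.0)]
conn.close()
```

Because only non-zero cells are stored, the join never touches pairs that would multiply to zero, which is the main appeal of the triple layout for sparse data.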