v1.0 issues #37

Open
2 of 6 tasks
ChrisRackauckas opened this issue Sep 1, 2018 · 6 comments

Comments

@ChrisRackauckas
Member

ChrisRackauckas commented Sep 1, 2018

Here are the v1.0 issues that occurred:

  • Need Traceur.jl on Julia v1.0 for WhyJulia (@pfitzseb ?)
  • MultivariateStats and Clustering need an update for the clustering and dimensionality reduction problems
  • Update word2vec and dimensionality reduction JLD (@oxinabox to JLD2?)
  • Plots.jl plotly() plots error
  • Could not spawn animation for Plots.jl
  • StatPlots.jl doesn't precompile
@oxinabox
Contributor

oxinabox commented Sep 3, 2018

Update word2vec and dimensionality reduction JLD

Actually, just replace the file loading with Embeddings.jl.

@ChrisRackauckas
Member Author

It doesn't look like it's a drop-in replacement?

using Embeddings 
embeddings = load_embeddings(Word2Vec)
Embeddings.EmbeddingTable{Array{Float32,2},Array{String,1}}(Float32[0.0673199 0.0529562 … -0.21143 0.0136373; -0.0534466 0.0654598 … -0.0087888 -0.0742876; … ; -0.00733469 0.0108946 … -0.00405157 0.0156112; -0.00514565 -0.0470722 … -0.0341579 0.0396559], ["</s>", "in", "for", "that", "is", "on", "##", "The", "with", "said" … "#-###-PA-PARKS", "Lackmeyer", "PERVEZ", "KUNDI", "Budhadeb", "Nautsch", "Antuane", "tricorne", "VISIONPAD", "RAFFAELE"])

all_words = collect(keys(embeddings))
display(all_words)
embeddings_mat = hcat(getindex.([embeddings], all_words)...)
MethodError: no method matching keys(::Embeddings.EmbeddingTable{Array{Float32,2},Array{String,1}})
Closest candidates are:
  keys(!Matched::Core.SimpleVector) at essentials.jl:580
  keys(!Matched::Cmd) at process.jl:837
  keys(!Matched::DataFrames.Index) at /home/chrisrackauckas/.julia/packages/DataFrames/utxEh/src/other/index.jl:66
  ...

Stacktrace:
 [1] top-level scope at In[15]:1

@oxinabox
Contributor

oxinabox commented Sep 3, 2018

It isn't drop-in, but it is really close.
Those lines are not required, as those are the fields of the type returned by load_embeddings.

I think, though, that a restricted list of words should be passed in to the vocab param to keep it from loading hundreds of thousands of words.
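
A minimal sketch of what "those are the fields" means in practice. `ToyTable` is a hypothetical stand-in that borrows only the two field names seen in this thread (`embeddings` and `vocab`), so it runs without downloading the Word2Vec data. The numbers are made up, and the one-column-per-word layout is an assumption worth checking against the real table.

```julia
# Hypothetical stand-in for the table returned by load_embeddings(Word2Vec).
# Only the field names (embeddings, vocab) come from this thread.
struct ToyTable
    embeddings::Matrix{Float32}  # assumed layout: one column per word
    vocab::Vector{String}        # vocab[i] labels column i
end

table = ToyTable(Float32[1 2 3; 4 5 6], ["in", "for", "that"])

# The fields already hold what the keys/hcat code was trying to build.
all_words = table.vocab
embeddings_mat = table.embeddings

# Map each word to its column so a single vector can be looked up by word.
word_index = Dict(word => i for (i, word) in enumerate(all_words))
vec_for = embeddings_mat[:, word_index["for"]]
```

Building the `Dict` once avoids a linear search through `vocab` for every lookup.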

@ChrisRackauckas
Member Author

all_words = embeddings.vocab
display(all_words)
embeddings_mat = embeddings.embeddings

Like this? I get an out-of-memory error after that.

@ChrisRackauckas
Member Author

Oh, that's why it should be restricted.
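
One way to picture the restriction, sketched on made-up data rather than the real Word2Vec table. The `wanted` set and all values are hypothetical. Embeddings.jl appears to accept `keep_words` and `max_vocab_size` keywords on `load_embeddings` for this purpose, but treat those names as an assumption and confirm them in the package README.

```julia
# Made-up data: 2-dimensional vectors for four words, one column per word.
vocab = ["in", "for", "that", "tricorne"]
embeddings_mat = Float32[1 2 3 4; 5 6 7 8]

# Hypothetical list of the words the analysis actually needs.
wanted = Set(["for", "tricorne", "missing_word"])

# Keep only the columns whose word is wanted; words absent from vocab are ignored.
keep = findall(w -> w in wanted, vocab)
small_vocab = vocab[keep]
small_mat = embeddings_mat[:, keep]
```

Filtering after loading, as here, only shrinks what downstream code touches. The memory saving comes from handing the restricted list to the loader, so that the full table is never built.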

@pfitzseb

pfitzseb commented Sep 7, 2018

Traceur works on 1.0 now, btw.
