Cross-link to external docsets #13

Open
jpsim opened this issue Jul 3, 2014 · 22 comments

Comments

@jpsim
Collaborator

jpsim commented Jul 3, 2014

No description provided.

@jpsim jpsim added the frontend label Jul 3, 2014
@jpsim
Collaborator Author

jpsim commented Nov 1, 2014

Linking to Apple's docs and other projects should be possible, but will likely be a fairly difficult task.

@segiddins
Collaborator

If you can get the module that an external symbol refers to, this might be possible in CocoaDocs.

@jpsim
Collaborator Author

jpsim commented Nov 1, 2014

Agreed.

@istx25
Contributor

istx25 commented Nov 23, 2016

What's the progress on this?

@jpsim
Collaborator Author

jpsim commented Nov 26, 2016

No one's done any work on this as far as I can tell.

@1ec5
Collaborator

1ec5 commented Jan 9, 2017

Invoking a command like this:

xcrun docsetutil search -skip-text -query CLLocation ~/Library/Developer/Shared/Documentation/DocSets/com.apple.adc.documentation.iOS.docset

yields output like this:

 Swift/cl/-/CLLocation   documentation/CoreLocation/Reference/CLLocation_Class/index.html#//apple_ref/swift/cl/c:objc(cs)CLLocation
 Objective-C/cl/-/CLLocation   documentation/CoreLocation/Reference/CLLocation_Class/index.html#//apple_ref/occ/cl/CLLocation

We can construct a URL based on one of these file path–apple_ref combinations:

https://developer.apple.com/documentation/CoreLocation/Reference/CLLocation_Class/index.html#//apple_ref/occ/cl/CLLocation

which redirects to:

https://developer.apple.com/reference/corelocation/cllocation#//apple_ref/occ/cl/CLLocation

The query can be the name of a class, method, etc., but the redirect for a method only goes to the class reference. We can specify multiple symbols at a time, separated by spaces.
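The URL construction described above can be sketched roughly as follows. This is only an illustration, assuming each result line has the two-column shape shown in the sample output; `DocsetHit`, `parse_docsetutil_line`, and `apple_url` are hypothetical names, not part of docsetutil or jazzy.

```python
from typing import NamedTuple, Optional

# Assumption: the base URL is the one used in the example URL above.
APPLE_DOCS_BASE = "https://developer.apple.com/"


class DocsetHit(NamedTuple):
    language: str  # e.g. "Objective-C" or "Swift"
    kind: str      # e.g. "cl" for a class
    name: str      # e.g. "CLLocation"
    path: str      # file path plus apple_ref fragment


def parse_docsetutil_line(line: str) -> Optional[DocsetHit]:
    """Parse one result line of the form 'Language/kind/scope/Name   path#fragment'."""
    parts = line.split(None, 1)
    if len(parts) != 2:
        return None
    token, path = parts[0], parts[1].strip()
    fields = token.split("/", 3)
    if len(fields) != 4:
        return None
    language, kind, _scope, name = fields
    return DocsetHit(language, kind, name, path)


def apple_url(hit: DocsetHit) -> str:
    """Prefix the docset-relative path with the developer site base URL."""
    return APPLE_DOCS_BASE + hit.path
```

Feeding it the Objective-C line from the sample output should give back the first URL shown above; the redirect to the newer reference URL would still happen on Apple's side.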

@1ec5 1ec5 changed the title from "Cross-linking" to "Cross-link to external docsets" Jan 9, 2017
@jpsim
Collaborator Author

jpsim commented Jan 10, 2017

Nice finds @1ec5!

@galli-leo

With reference to my other comment (#190), should this be integrated into SourceKitten or jazzy? Additionally, do you think just reading the sqlite database and text-searching for module, then symbol is enough or should I try to reverse engineer the Xcode frameworks some more?

@johnfairh
Collaborator

I think it would go in Jazzy as a last step in resolving autolinks. Jazzy already has a dependency on the Ruby sqlite3 gem (for the docset builder) which groks that db fine, so should be fast to query it.

@galli-leo

@johnfairh Gotcha. Do you know if CocoaDocs also stores the search JSON file? The few doc pages from there that I tried didn’t seem to have it :(. If it does, we could even add doc links to all CocoaPods.

@galli-leo

I will try to get something working today and put up a PR for discussion.

@galli-leo

galli-leo commented Jun 16, 2018

Whooo some good news! Finally figured out how Xcode generates the uuids for the sqlite db. This should simplify this immensely. It's just a shortened sha1 of the usr of a symbol (provided it's actually a symbol).

Regarding whether to implement this in jazzy or sourcekitten: we definitely have to rework autolinking in jazzy, because at that step we no longer have any usr information. I also think this is better suited to either sourcekitten or a separate app, since we need to execute cursorinfo for every symbol to get the usr.

Edit: After actually looking at the sourcekitten output, that should be enough to get it working (the fully annotated decl already contains the usr for any external symbol). However, IMO it would be nice to integrate that into sourcekitten, since then others besides jazzy could also benefit from it.

Edit 2: Hmm, it seems that everything contains a usr link (if available). Maybe it would be a good idea to use that for autolinking instead of searching by name?

@galli-leo

Ok so my current implementation idea would be:

  1. Use annotated_decl if available (looks like this: <Declaration>public override var description: <Type usr="s:SS">String</Type> { get }</Declaration>)
  2. Read that as XML and "strip" any tags that we don't need. Keep the Type tag (or any tag with usr info, though I haven't come across others). Transform it so it's always the same and convert the usr to an external / internal link (e.g. <USRLINK url="...">...</USRLINK>).
  3. Write a custom rouge lexer that scans for that and returns a custom token. (Already implemented that)
  4. Modify the HTML lexer to output an a tag with the url. (Already implemented that)
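Steps 1 and 2 of the list above could look roughly like this. It is a sketch under stated assumptions, not the actual SourceKitten or jazzy code: `link_annotated_decl` and `USR_TO_URL` are hypothetical names, the lookup table stands in for whatever resolves a usr to a URL, and a usr without a known URL is simply rendered as plain text here rather than as an internal link.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Hypothetical lookup table; the s:SS entry mirrors the String URL seen in this thread.
USR_TO_URL = {"s:SS": "https://developer.apple.com/documentation/swift/string"}


def link_annotated_decl(annotated_decl: str, usr_to_url: dict) -> str:
    """Flatten an annotated declaration, keeping only usr-bearing tags as USRLINK."""
    root = ET.fromstring(annotated_decl)

    def render(node) -> str:
        # Text directly inside the node, then each child followed by its tail text.
        out = escape(node.text or "")
        for child in node:
            out += render_child(child)
            out += escape(child.tail or "")
        return out

    def render_child(child) -> str:
        inner = render(child)
        usr = child.get("usr")
        url = usr_to_url.get(usr) if usr else None
        if url is None:
            return inner  # strip the tag, keep its text
        return f"<USRLINK usr={quoteattr(usr)} url={quoteattr(url)}>{inner}</USRLINK>"

    return render(root)
```

With the declaration from step 1 and the table above, the `Type` element around `String` becomes a `USRLINK` element and the outer `Declaration` wrapper is dropped.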

With this "new system", should the old autolinking methods still be kept? Or should we replace anything we find with the same USRLINK tokens and let the lexer & highlighter handle everything (IMO the better option)? Unfortunately, I haven't found a way to convert the dot notation to a usr (if anyone has an idea that would be great), so autolinking to external docs will be difficult. We can still use just text searching, but that will probably not be as reliable.

Additionally, we could even start linking code blocks inside markdown by passing the code blocks to sourcekit's index request and then inserting the provided usrs. (Though that's probably out of scope)

@1ec5
Collaborator

1ec5 commented Jun 16, 2018

The existing autolinking methods are also used for references to symbols within backticks in documentation comments. I don’t think those references get marked up with USRs.

@galli-leo

@1ec5 Yeah, that’s what I meant by dot notation. There we would either have to go back to the old system or resolve the usrs ourselves using the old methods.

@johnfairh
Collaborator

Nice detective work!

My 2p: I think that searching the DB (approx select reference_path from map where reference_path like '%/uiviewcontroller/%' etc. for method references) as an addition to the current autolink resolver would be a good first step. This would address Swift declarations, Objective-C declarations, and references to types/methods made from markdown docs or doc comments. I understand this might not be 100% accurate but I feel it will be pretty successful.
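A minimal sketch of that lookup, assuming a table named `map` with a `reference_path` column as in the approximate query above. The real schema of Xcode's database is not shown in this thread, so the in-memory demo table, its rows, and the function names are all hypothetical.

```python
import sqlite3


def find_reference_paths(conn: sqlite3.Connection, type_name: str) -> list:
    """Return reference paths containing '/<type_name>/' (lowercased), sorted."""
    # Note: '%' or '_' inside type_name would act as LIKE wildcards.
    pattern = f"%/{type_name.lower()}/%"
    rows = conn.execute(
        "SELECT reference_path FROM map "
        "WHERE reference_path LIKE ? ORDER BY reference_path",
        (pattern,),
    )
    return [row[0] for row in rows]


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory stand-in for the docs database (hypothetical rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE map (reference_path TEXT)")
    conn.executemany(
        "INSERT INTO map VALUES (?)",
        [
            ("documentation/uikit/uiviewcontroller/hypothetical-method-a",),
            ("documentation/uikit/uiviewcontroller/hypothetical-method-b",),
            ("documentation/uikit/uiview/hypothetical-method-c",),
        ],
    )
    return conn
```

Calling `find_reference_paths(make_demo_db(), "UIViewController")` returns only the two paths under `/uiviewcontroller/`; the `uiview` row is not matched because the pattern requires the trailing slash.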

Then look at the USRLINK part and rearranging the data structures to be USR-based as a separate piece for Swift -- there are definite good reasons for doing that, performance, features, that last few % accuracy. But, there are common places this doesn't work (objc decls, markdown) so prefer to do 'good enough' on general case first.

On the sourcekitten/jazzy thoughts - maybe you could output an extra JSON object from sourcekitten doc that maps from USRs to either apple doc URLs or the DB uuid, where those USRs were accumulated and uniqued from the preceding doc json.
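One way to picture that extra JSON object: walk the doc JSON, collect each usr once, and emit a usr-to-URL map. This is a sketch only; it assumes USRs appear under a `key.usr` key in nested dictionaries and lists, and `collect_usrs` and `usr_link_map` are hypothetical helpers rather than SourceKitten API.

```python
import json


def collect_usrs(node, seen=None) -> list:
    """Accumulate unique USRs from a nested JSON structure, in first-seen order."""
    if seen is None:
        seen = {}  # dict used as an ordered set
    if isinstance(node, dict):
        usr = node.get("key.usr")
        if isinstance(usr, str):
            seen.setdefault(usr, None)
        for value in node.values():
            collect_usrs(value, seen)
    elif isinstance(node, list):
        for item in node:
            collect_usrs(item, seen)
    return list(seen)


def usr_link_map(doc_json, usr_to_url: dict) -> dict:
    """Map each collected usr to a URL, skipping USRs with no known URL."""
    return {usr: usr_to_url[usr] for usr in collect_usrs(doc_json) if usr in usr_to_url}


if __name__ == "__main__":
    # Hypothetical doc JSON fragment; only the shape matters here.
    doc = {"key.substructure": [{"key.usr": "s:SS"}, {"key.usr": "s:SS"}]}
    urls = {"s:SS": "https://developer.apple.com/documentation/swift/string"}
    print(json.dumps(usr_link_map(doc, urls), indent=2))
```

Emitting the map once per run keeps the per-symbol JSON unchanged, so a consumer such as jazzy could look up URLs by usr without re-querying anything.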

@galli-leo

galli-leo commented Jun 17, 2018

@johnfairh Maybe we could add another output to sourcekitten that’s parsed_with_links? Because using the USRLINK seems easier than doing a fuzzy text search, especially for Types. (I mostly implemented the USRLINK approach already :P)

Yeah, it won’t work for markdown comments or Objective-C, but we could run them through cursorinfo as well (if they are just types) or do a fuzzy text search to create a USRLINK object as well.

@galli-leo

galli-leo commented Jun 17, 2018

So I got it working quite nicely in SourceKitten:

public class Test: Swift.CustomStringConvertible
"key.parsed_annotated_decl" : "public class Test : Swift.<USRLINK usr=\"s:s23CustomStringConvertibleP\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/customstringconvertible\">CustomStringConvertible<\/USRLINK>"

"key.parsed_annotated_decl" : "public func test(first: <USRLINK usr=\"s:SS\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/string\">String<\/USRLINK>, second: <USRLINK usr=\"s:SS\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/string\">String<\/USRLINK>, closure: @escaping ((<USRLINK usr=\"s:SS\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/string\">String<\/USRLINK>, <USRLINK usr=\"s:SS\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/string\">String<\/USRLINK>) -> (<USRLINK usr=\"s:SS\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/string\">String<\/USRLINK>, <USRLINK usr=\"s:SS\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/string\">String<\/USRLINK>))) throws -> <USRLINK usr=\"s:SS\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/string\">String<\/USRLINK>",
            "key.parsed_declaration" : "public func test(first: String, second: String, closure: @escaping ((String, String) -> (String, String))) throws -> String",

I even got it working for parsing arbitrary types quite easily (e.g. in a doc comment with Test), by just calling cursorinfo with the type as text. (So this works even with Apple types in doc comments!)

The next step would be parsing arbitrary function or property references. My idea would be (instead of using the old system), to have SourceKitten "register" any usrs it finds, with their parent(s) and parameters, and when parsing doc comments, go back to that list and find anything that matches.
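The "register, then match" idea could be sketched as below. This is not the SourceKitten implementation: `UsrIndex` is a hypothetical class, parameters are assumed to be carried in the symbol name itself (e.g. `test(first:second:closure:)`), and a reference is matched against the trailing components of each registered path.

```python
from typing import Optional


class UsrIndex:
    """Index of USRs keyed by (parent..., name) so dotted references can be matched."""

    def __init__(self):
        self._by_path = {}

    def register(self, usr: str, name: str, parents=()) -> None:
        """Record a usr under its full path of parent names plus its own name."""
        self._by_path[tuple(parents) + (name,)] = usr

    def resolve(self, reference: str) -> Optional[str]:
        """Resolve a dotted reference; return None if nothing or several USRs match."""
        wanted = tuple(reference.split("."))
        matches = {
            usr
            for path, usr in self._by_path.items()
            if path[-len(wanted):] == wanted
        }
        return matches.pop() if len(matches) == 1 else None
```

Returning `None` on an ambiguous match is a design choice for the sketch: a wrong link is arguably worse than no link, and the caller can fall back to another strategy.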

This way we can even add a new command to SourceKitten that resolves a usr or any bit of pseudocode to get a documentation reference and / or usr.

Let me know what you think. We can also just have SourceKitten link anything with a concrete usr and then autolink anything else in jazzy using the existing system. But I think implementing this in SourceKitten makes more sense, since other people can benefit from it too and it would provide a replacement for the old docsetutils.

Should I create a pr with the SourceKitten changes already? It's really messy atm.

@johnfairh
Collaborator

If we have something that works everywhere (swift/objc/markdown) then I'm happy. A replacement for docsetutils sounds like a useful tool.

My only concern -- that I won't go on about any more after this because it feels like I'm just sitting here criticising while you do work!! -- is complexity: if we have the 'sql match' fallback we may as well just always use it. I pushed a small (hacky, incomplete, not widely tested...) sketch of this to this branch just to check we're on the same page. Resolves types String/NSMutableArray / UIViewController etc. but not method refs eg. UIViewController.prepare(for:sender:).

I think I understand your go-back-into-sourcekitten approach. A technical problem might be that type references in markdown / doc comments etc. may not actually compile (missing imports). I can see the attraction in treating code more like compilable code than parsable text.

If you think you're on the right track in sourcekitten then can't hurt to put up a PR so that team can look if they've bandwidth.

@galli-leo

galli-leo commented Jun 19, 2018

@johnfairh I decided to throw up my current code in a pr for criticism (jpsim/SourceKitten#537). Might want to take a look as well, since it's mostly about architectural decisions.

> My only concern -- that I won't go on about any more after this because it feels like I'm just sitting here criticising while you do work!! -- is complexity: if we have the 'sql match' fallback we may as well just always use it.

If you take a look at my pr, querying by reference path will not be used. Instead we create an index of all usrs and then can resolve any "dot notation" into a usr and use that to link (external or into the same module). Or use the usr if already available / use cursorinfo for types. Additionally, please criticise as much as you like, bad design doesn't help anyone :P.

> A technical problem might be that type references in markdown / doc comments etc. may not actually compile (missing imports). I can see the attraction in treating code more like compilable code than parsable text.

That's why I have "multiple" ways of finding a usr. First use cursorinfo. If it compiles and finds a type, great we are done here. Else search the "index" for any matches and then either add the url or create the reference to the usr.
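The fallback order described here can be sketched as a chain of resolvers tried in turn. Everything below is illustrative: `resolve_usr` and `link_for` are hypothetical names, and each resolver is just a callable standing in for a real cursorinfo request or index search.

```python
from typing import Callable, Iterable, Optional, Tuple

# A resolver takes a textual reference and returns a usr, or None if it cannot.
Resolver = Callable[[str], Optional[str]]


def resolve_usr(reference: str, resolvers: Iterable[Resolver]) -> Optional[str]:
    """Try each resolver in order and return the first usr found."""
    for resolver in resolvers:
        usr = resolver(reference)
        if usr is not None:
            return usr
    return None


def link_for(
    reference: str, resolvers: Iterable[Resolver], external_urls: dict
) -> Optional[Tuple[str, str]]:
    """Return ("url", url) for external symbols, ("usr", usr) otherwise, or None."""
    usr = resolve_usr(reference, resolvers)
    if usr is None:
        return None
    url = external_urls.get(usr)
    return ("url", url) if url is not None else ("usr", usr)
```

For example, passing a cursorinfo stand-in first and an index lookup second means the index is only consulted when the reference does not compile or is not found as a type.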

Do you know if clang gives any info about usrs? I haven't had time to look at the non sourcekitten side.

Sidenote: By implementing this in sourcekitten we already have all the possible Xcode paths in there; see the PR :).

@johnfairh
Collaborator

Libclang: see here.

@galli-leo

Thought I would post a quick update here. I got the SourceKitten implementation mostly working (i.e. parse doc comments and declarations). Additionally, I hacked jazzy to work with the new SourceKitten implementation and a live demo can be found here: http://galli.me/jazzy-demo/index.html.
