On some sites, such as the LinkedIn transparency reports, the terms of interest are located at dynamically named endpoints whose names may, for example, depend on the publication date (e.g. October-2023-LinkedIn-DSA-Transparency-Report10.pdf). The links to these dynamic endpoints are in most cases found at fixed locations. It thus makes sense to introduce a new declaration term, dynamic-fetch.
This term would fetch the document located at the dynamic endpoint selected by dynamic-fetch.variable on the page at dynamic-fetch.location. It would be complementary to fetch.
It could potentially be defined as follows:
{
  "name": "Linkedin",
  "documents": {
    "Transparency Ad Library": {
      // This shall fetch the pdf doc at https://content.linkedin.com/content/dam/help/linkedin/en-us/October-2023-LinkedIn-DSA-Transparency-Report10.pdf
      "dynamic-fetch": {
        "variable": "div[class=\"t-14 article-content__rich-text hue-default-color\"] > ul > li:first-child > a.getAttribute('href')",
        "location": "https://www.linkedin.com/help/linkedin/answer/a1678508?hcppcid=search"
      }
    }
  }
}
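To make the intended behaviour concrete, here is a minimal sketch of how the link could be resolved once the page at dynamic-fetch.location has been downloaded. It is only an illustration under assumptions: the names FirstLinkFinder and resolve_dynamic_location are hypothetical, and because it uses only the Python standard library it does not evaluate the CSS selector from the declaration. It simply takes the first link found inside a list item, which is a much cruder rule than the proposed selector.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FirstLinkFinder(HTMLParser):
    """Records the href of the first <a> found inside a list item (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.list_depth = 0  # number of <li> elements currently open
        self.href = None     # first matching href, once found

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.list_depth += 1
        elif tag == "a" and self.list_depth > 0 and self.href is None:
            # Anchors without an href leave self.href as None, so the next one is tried.
            self.href = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag == "li" and self.list_depth > 0:
            self.list_depth -= 1


def resolve_dynamic_location(page_html, location):
    """Return the absolute URL of the document linked from the page at `location`."""
    finder = FirstLinkFinder()
    finder.feed(page_html)
    if finder.href is None:
        raise LookupError("no link found at location")
    # Relative links are resolved against the page they were found on.
    return urljoin(location, finder.href)
```

The returned URL would then be handed to the existing fetch logic, so the dynamic part stays limited to discovering the address.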
As I am pretty new to this tool, I would be happy to hear some feedback about this proposal!
If you share my vision, I would be happy to implement it :)
Indeed, it happens sometimes that terms are only available as a downloadable file behind a link. The idea of obtaining the URL dynamically from the DOM is a smart answer to that problem 👍
The main question we need to answer to decide if it would be worth adding a new type of fetch is: are the location and DOM from which we obtain the link any more stable than the link itself? In the case at hand, DSA Transparency Reports are published every 6 months. We'd need to demonstrate that the location and DOM from which the link can be obtained change significantly less often than twice a year, otherwise the maintenance burden will be the same on collection maintainers, and we would have increased software complexity for nothing 😰
The next investigation steps I see are:
1. Identify at least 2 other cases where such a system would be used.
2. Measure with the Wayback Machine (or any other reliable history tool) how often the location or link selector changed (l) vs how often the target of the link changed (t) in at least the last 2 years.
3. If t > e ⨉ l, where e is some arbitrary multiplier encoding the effort it would take to implement this feature, we'll consider it 🙂
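The comparison in the last step can be sketched as follows. This is only an illustration: the function names are hypothetical, and the inputs are assumed to be chronological lists of observations (for example one value per archived snapshot), with a change counted whenever two consecutive observations differ.

```python
def count_changes(observations):
    """Count how many times consecutive observations differ."""
    return sum(1 for prev, cur in zip(observations, observations[1:]) if prev != cur)


def worth_considering(link_targets, link_sources, effort_multiplier):
    """Apply the rule t > e * l to two observation histories."""
    t = count_changes(link_targets)  # how often the target of the link changed
    l = count_changes(link_sources)  # how often the location or link selector changed
    return t > effort_multiplier * l
```

For instance, with made-up histories where the target changed 4 times and the selector changed once, the feature passes the test for e = 2 but not for e = 4.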
MattiSG changed the title from "New Declaring term: Dynamic Fetching" to "Dynamic Fetching" on Mar 14, 2024.
MattiSG changed the title from "Dynamic Fetching" to "Obtain location dynamically from link" on Mar 14, 2024.