This repository has been archived by the owner on Jun 7, 2022. It is now read-only.

Deprecated API Endpoint (Invalid JSON data returned from network!) #69

Open
Sillson opened this issue Dec 22, 2020 · 23 comments

@Sillson

Sillson commented Dec 22, 2020

Hey @ReagentX

Heads up that the fine folks at PurpleAir deprecated the use of the purpleair.com/json endpoint and developed their own API. https://api.purpleair.com/

My job requires me to process a large amount of PurpleAir data quickly, and at a high frequency. I was going to spike out on a new package.

@ReagentX
Owner

Last I checked, there was no way to get API keys yet.

@Sillson
Author

Sillson commented Dec 22, 2020

Are you planning on retooling this package to support authenticating via a user's token, plus utilizing the new endpoints/responses? It's a fairly seismic shift.

@ReagentX
Owner

ReagentX commented Dec 22, 2020

The new API does not provide ThingSpeak IDs, so I am not sure how much of this package can be reused.

I would like this package to continue to work, but I need the new API to work first before I can try to make it compatible.

@Sillson
Author

Sillson commented Dec 22, 2020

@ReagentX Yes, the new API has ThingSpeak IDs in the individual sensor channel. Again though, it is a large shift and would require an entire overhaul. Unfortunately I would know: I'm currently overhauling my own PurpleAir processing streams.

@thomastu

Chiming in on this issue - it looks like key registration occurs via email. I was able to get a key by emailing [email protected]

@ReagentX
Owner

Is there a date when the old API shuts off?

@thomastu

thomastu commented Feb 10, 2021

According to the Google Doc linked here: https://www.purpleair.com/sensorlist the timeline is "soon", but the document seems to have been created in 2017. It might be worth sending an email to ask.

NOTICE: PurpleAir is migrating all users to our new API at https://api.purpleair.com. Please email [email protected] to get a key to use it and you should know that we will be preventing access to the current url’s at www.purpleair.com/data.json and www.purpleair.com/json soon

@dulitz

dulitz commented Jul 13, 2021

The JSON endpoint is unreliable for me now -- it is returning data for my sensor that doesn't match what's shown on the website (or, as far as I can tell, reality).

Is someone working on this? I haven't been using ThingSpeak at all, just polling the JSON, and intend to continue not to use ThingSpeak, so I'm not the ideal candidate to patch this up.

6 months ago when I first looked at PurpleAir's JSON document I don't remember seeing any deprecation section, but they now have big red text at the top saying we gotta switch to the new API real soon now.

@Sillson
Author

Sillson commented Jul 13, 2021

@dulitz I've been using the new API for the past several months -- and have worked with the PurpleAir team to add fields necessary to perform most actions/analysis from the sensors (synoptic) level querying. So far I've found it to be performant and very easy to use.

I have my own internal toolkit that I've developed for my lab, but it's fairly specific to my needs. The PurpleAir API should be easy to implement. I was going to spike on a new package, but with the new API it felt unnecessary. Feel free to HMU if you have any questions on how to use it.

@Sillson Sillson closed this as completed Jul 13, 2021
@ReagentX ReagentX reopened this Jul 13, 2021
@ReagentX
Owner

ReagentX commented Jul 13, 2021

Keeping this issue open as I intend v2.0 to leverage the current API. That said, the unit tests still pass for me. I am not sure why @dulitz is receiving bad data from the network, but this library does not alter the data it receives.

The new API documentation still does not provide instructions on where one can acquire an api_key.

@ReagentX
Owner

ReagentX commented Jul 13, 2021

Just for clarification–the point of this library is to provide a Pythonic interface to interact with air quality data.

Indexing with JSON/dict data is annoying and error prone. KeyErrors are annoying to debug and often lack context. This was my effort to make air quality data available to beginner programmers. One of the ways the library accomplishes this is ensuring errors are handled gracefully and explicitly so the user knows what went wrong without firing up a debugger.
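
To illustrate the kind of explicit error handling described above, here is a minimal sketch. It is not this library's actual code: `MissingFieldError`, `get_field`, and the example field names are hypothetical and exist only for illustration.

```python
class MissingFieldError(LookupError):
    """Raised when a sensor record lacks a requested field (hypothetical name)."""


def get_field(record, field, sensor_id="unknown"):
    """Return record[field], or fail with a message that carries context."""
    try:
        return record[field]
    except KeyError:
        # List what *is* present so the user can spot a typo or a schema change.
        available = ", ".join(sorted(record)) or "none"
        raise MissingFieldError(
            f"sensor {sensor_id!r} has no field {field!r}; "
            f"available fields: {available}"
        ) from None
```

Compared with a bare `KeyError: 'temperature'`, the message names the sensor and the fields that were actually returned, which is usually enough to diagnose the problem without a debugger.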

If the data that powers this tool is no longer available to the public, then effort to develop this library only serves some institutions, not independent developers, and those institutions should not rely on free labor to power their research.

@dulitz

dulitz commented Jul 13, 2021

The data that powers this tool, i.e. the source data of the PurpleAir monitors that PurpleAir manufactured and we purchased, is (as far as I know, and I haven't researched it deeply) as available to the public as it was before. It was available under a TOU before and it is available under a TOU now. AFAICT the issuance of an api key is automatic; they didn't ask me why I wanted one when I emailed to request one, and they didn't at that point ask me to agree to anything.

Of course that could change at any time. I just don't observe that a change has happened thus far.

To be clear, I'm not using this library, so any inaccurate JSON data (specifically for temperature) has nothing to do with this code. I had been wondering whether I should just roll my own client and I see that the answer is yes.

@Sillson
Author

Sillson commented Jul 13, 2021

@ReagentX understood, and agreed that a library to help guide beginner users has value. As well as shortcuts to map/visualize/compare etc. One of the benefits of the API is that they introduced a sensor_index instead of having to rely on indexing JSON 👍. The value of the data may change if the API introduces a paywall -- but as I understand it, hosting an open JSON payload that has been abused was quite the engineering feat to keep up (and price).

@ReagentX
Owner

Understood. @dulitz, I know this is unrelated to this ticket, but do you know if the new API provides accurate data for your sensor? If so, that would present a very obvious reason for me to overhaul this library, since there are institutions using it for real research.

@dulitz

dulitz commented Jul 13, 2021

I don't know, but I expect to know within a week or so and I'll follow up here when I do.

@ReagentX
Owner

Back on topic:

I received this reply from PurpleAir regarding the new API limitations:

The data from individual sensors will update no less than every 30 seconds. As a courtesy, we ask that you limit the number of requests to no more than once every 1 to 10 minutes, assuming you are only using the API to obtain data from sensors. If retrieving data from multiple sensors at once, please send a single request rather than individual requests in succession.

This requires some obvious retooling of the SensorList() class.

I'm going to use this ticket to track the changes I need to make, so you may want to unsubscribe from this thread.
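
One way to respect the request interval quoted above is a small client-side throttle. The sketch below is only an illustration: `PollThrottle` is a hypothetical helper, not part of this library, and the 60-second default is simply the low end of the "1 to 10 minutes" range from PurpleAir's reply.

```python
import time


class PollThrottle:
    """Allow at most one request per `min_interval` seconds (hypothetical helper)."""

    def __init__(self, min_interval=60.0, clock=time.monotonic):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the class can be tested without sleeping
        self._last = None    # time of the last permitted request, if any

    def seconds_until_allowed(self):
        """Return how long the caller should still wait, or 0.0 if it may proceed."""
        if self._last is None:
            return 0.0
        elapsed = self._clock() - self._last
        return max(0.0, self.min_interval - elapsed)

    def try_acquire(self):
        """Return True and record the time if a request may be sent now."""
        if self.seconds_until_allowed() > 0:
            return False
        self._last = self._clock()
        return True
```

A caller would check `try_acquire()` before each poll and, when it returns `True`, send a single batched request covering all sensors rather than one request per sensor.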

@zdwhite

zdwhite commented Jul 21, 2021

> @ReagentX understood, and agreed that a library to help guide beginner users has value. As well as shortcuts to map/visualize/compare etc. One of the benefits of the API is that they introduced a sensor_index instead of having to rely on indexing JSON 👍. The value of the data may change if the API introduces a paywall -- but as I understand it, hosting an open JSON payload that has been abused was quite the engineering feat to keep up (and price).

Can you help me out a bit? I am struggling with using the sensor_index endpoint. I'm able to call individual sensors just fine with my API key, but their documentation leaves much to the imagination on how to grab multiple sensors in a single call.

@dulitz

dulitz commented Jul 21, 2021

This is off-topic, but since you asked. :) You can't use the sensor_index endpoint to retrieve data for multiple sensors at once; you use the sensors endpoint with the show_only parameter.

That's according to the docs; I haven't actually fetched from their API yet.
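
For anyone landing here with the same question, the multi-sensor call described above can be sketched as follows. This is a minimal sketch that has not been verified against the live API. Only the `sensors` endpoint and the `show_only` parameter come from this thread; the base URL `https://api.purpleair.com/v1`, the `X-API-Key` header, the `fields` parameter, and the example field names and sensor indexes are all assumptions. `build_sensors_request` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed values; check them against PurpleAir's own documentation.
API_BASE = "https://api.purpleair.com/v1"
API_KEY_HEADER = "X-API-Key"


def build_sensors_request(api_key, sensor_indexes, fields):
    """Build (but do not send) one GET request covering several sensors."""
    if not sensor_indexes:
        raise ValueError("at least one sensor_index is required")
    query = urlencode(
        {
            "fields": ",".join(fields),
            "show_only": ",".join(str(index) for index in sensor_indexes),
        },
        safe=",",  # keep the comma-separated lists readable in the URL
    )
    return Request(f"{API_BASE}/sensors?{query}", headers={API_KEY_HEADER: api_key})


# Placeholder key, sensor indexes, and field names.
request = build_sensors_request("MY_KEY", [101, 202], ["name", "temperature"])
print(request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it. Letting `urlencode` assemble the query string avoids the hand-built URL mistakes that are easy to make with comma-separated lists.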

@zdwhite

zdwhite commented Jul 21, 2021

> This is off-topic, but since you asked. :) You can't use the sensor_index endpoint to retrieve data for multiple sensors at once; you use the sensors endpoint with the show_only parameter.
>
> That's according to the docs; I haven't actually fetched from their API yet.

You're correct, I misspoke. Trying to build the URL query string by hand was silly. I just ended up using an HTTP POST method instead.

@dulitz

dulitz commented Aug 30, 2021

@ReagentX I said I'd report back on the API accuracy with respect to temperature, and here I am, better late than never. It turns out that the issue isn't really the API, it's that the sensor reports raw data prior to correction. Their new API has this note in the documentation for the temperature field: "On average, [temperature inside the device housing] is 8F higher than ambient conditions."

So it looks like they subtract 8 degrees F before reporting the temperature on their website but they do not do this in either API. (I have a question in to them about whether they have a more sophisticated calibration model but have not yet heard back.)
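
As a rough illustration of that correction, the sketch below subtracts the quoted average offset from a reported reading. It assumes a flat 8 °F offset, which is only the average stated in the documentation note; PurpleAir's actual website correction may differ. The function names are hypothetical.

```python
# Average housing offset quoted from PurpleAir's documentation note, in °F.
HOUSING_OFFSET_F = 8.0


def estimate_ambient_f(reported_f, offset_f=HOUSING_OFFSET_F):
    """Estimate ambient temperature (°F) from the in-housing reading (°F)."""
    return reported_f - offset_f


def fahrenheit_to_celsius(temp_f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


reported = 80.0  # example raw reading from either API, in °F
ambient = estimate_ambient_f(reported)
print(f"{ambient:.1f} °F / {fahrenheit_to_celsius(ambient):.1f} °C")
```

Keeping the offset as a parameter makes it easy to swap in a better calibration if one becomes available.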

So data accuracy is no reason to use the newer endpoint. However the newer endpoint is so similar to the old one that you could easily shim it into your code, and that might be good insurance against them doing what they've been saying they'd do for 18 months now (shutting off the old API).

You or others may have documented this discrepancy; it may be well known. But it was not known to me even though it was clear the temperature reported by the API was super inaccurate. :)

@brooketalcott

Thanks for raising the issue @Sillson
I just got my API key from them and they confirmed, "The new API should be taking over literally any day now"

@ReagentX
Owner

Historical data is moving to BigQuery: https://community.purpleair.com/t/purpleair-data-migration-to-bigquery-and-a-new-api/295

Since BigQuery has native support for DataFrames, I am not sure what value this library would provide beyond wrapping bigquery, which seems redundant.

@ReagentX ReagentX self-assigned this May 30, 2022
@ReagentX
Owner

@ReagentX ReagentX pinned this issue May 30, 2022
@ReagentX ReagentX changed the title from "Deprecated API Endpoint" to "Deprecated API Endpoint (Invalid JSON data returned from network!)" May 30, 2022

6 participants