Adding a dummy-ups script to examples? #2807

Open
sshambar opened this issue Feb 17, 2025 · 14 comments

Comments

@sshambar
Contributor

I wrote a bash script to integrate my local Enphase Envoy API (a simple JSON-based local web API) with a dummy-ups driver's port file. I've just published it (GPLv3) at https://github.com/sshambar/enphase-monitor

The script is simple to use, and provides information on "grid" online/offline status, battery SOC, and several other details (like "load" and "runtime") from the Enphase system. I use it to integrate with my current UPS + ATS (automatic transfer switch) and treat my house battery system as a 2nd UPS. I didn't feel it merited writing a full driver as I'm not using it for FSD, outlet control or anything "interactive"; I just didn't want my servers shutting down when my UPS failed (don't ask) and the house still had backup power. Plus it's nice to have another way to monitor the Enphase system using the NUT cgi programs...

The script is probably something others could use as a basis for integrating with similar web APIs using dummy-ups as a hook into the NUT system, so I thought perhaps it made sense to add it as an "example" script someplace in the source?

The script has nice features such as:

  • retains existing/extra values in the port file
  • gracefully handles split-phase or 3-phase input/output values
  • handles no-comms with temporary rename of the port file (indicates "stale data")
  • supports port file value customization (things like battery voltage range, or battery.charge.low for LB trigger)
  • contains logic for login and token based API authentication, with automatic token renewal
  • has its own config for authentication, API query timing etc.
  • checks permissions on files that use secrets
  • is fully self-documented (leading comment in the script)
  • GPLv3 licensed
  • minimal requirements: bash, jq, base64 and curl
  • has details on setting up a systemd service to use with dummy-ups, and SELinux config
  • includes a "TEST" mode that loops through various states and randomly expires the token
  • ... and it's pretty well tested :)
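
The port-file handling described in the list above (retaining extra values, and renaming the file to signal no-comms) could be sketched roughly as follows. This is not the code from enphase-monitor: the function names, the `.nocomms` suffix and the 0644 mode are assumptions made for illustration, and the file is assumed to hold the `name: value` lines that dummy-ups reads. It needs bash 4 or later for associative arrays.

```shell
#!/usr/bin/env bash
# Sketch only: all names here are hypothetical, not taken from enphase-monitor.

# update_portfile FILE KEY VALUE [KEY VALUE ...]
# Rewrites FILE with the given values, keeps every other existing line,
# and swaps the result into place with a rename so readers never see a
# half-written file.
update_portfile() {
  local file=$1; shift
  local -A new=()
  while (( $# >= 2 )); do new[$1]=$2; shift 2; done

  local tmp line key
  tmp=$(mktemp "${file}.XXXXXX") || return 1

  # Copy existing lines unless their key is being replaced.
  if [[ -f $file ]]; then
    while IFS= read -r line || [[ -n $line ]]; do
      key=${line%%:*}
      if [[ -n $key ]] && [[ -n ${new[$key]+set} ]]; then continue; fi
      printf '%s\n' "$line"
    done <"$file" >"$tmp"
  fi

  # Append the fresh values.
  for key in "${!new[@]}"; do
    printf '%s: %s\n' "$key" "${new[$key]}"
  done >>"$tmp"

  # Assumption: the driver runs as a different user and needs read access.
  chmod 0644 "$tmp" && mv -f "$tmp" "$file"
}

# Move the port file aside while the API is unreachable, and back again.
set_nocomms()   { [[ -f $1 ]] && mv -f "$1" "$1.nocomms"; }
clear_nocomms() { [[ -f $1.nocomms ]] && mv -f "$1.nocomms" "$1"; }
```

The rename in `mv` is only atomic when the temporary file sits on the same filesystem as the port file, which is why the sketch creates it next to the target.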

I would have rather created a pull request against NUT, but I honestly don't know where I would add the file here... perhaps someone can suggest a directory if you think it might be useful.

In any case, anyone wanting to add Enphase local API support to NUT can use it as is.

PS. yes I know there's an "external" Enphase API, but its request limits are severely restricted without paying a high fee...

@jimklimov
Member

Sounds interesting, thanks!

Some new entry under the scripts/ directory sounds right for this. You would need also a README.adoc to describe this script (e.g. with text above as a starting point), and a Makefile.am to EXTRA_DIST the new files and spellcheck the document. If it is about a couple of new files, maybe add the names to scripts/Makefile.am here and there.

Also a NEWS.adoc entry would be nice :)

Tech-wise, it may be hard to specify Data stale'ness with a state file like this, e.g. if the API client script dies off, but otherwise the idea seems viable for a small-scale PoC at least. NUT drivers started out with files to talk to upsd way back when :)

@sshambar
Contributor Author

Some new entry under the scripts/ directory sounds right for this. You would need also a README.adoc to describe this script (e.g. with text above as a starting point), and a Makefile.am to EXTRA_DIST the new files and spellcheck the document. If it is about a couple of new files, maybe add the names to scripts/Makefile.am here and there.

I can probably blend the description above with the existing README from the file comments... I'll have to research the .adoc format :) (btw, were there spelling mistakes I missed?)

Tech-wise, it may be hard to specify Data stale'ness with a state file like this, e.g. if the API client script dies off, but otherwise the idea seems viable for a small-scale PoC at least. NUT drivers started out with files to talk to upsd way back when :)

Yeah, the staleness was really designed to reflect the network connectivity to the API server, not the liveness of the monitor, but that did give me a couple of ideas:

A. I should probably have the monitor start "before" the dummy-ups driver with an ExecStartPre mode to disable the port file, so any pre-shutdown (and now quite stale) state isn't exposed at startup, and connectivity is "proved" before a real ups.status is made available.

B. I could potentially offer a pull request for dummy-ups with a new "keyword" (TBD named, "ACTIVE_TIMEOUT"?) which would cause the driver to claim data "stale" if the modify-time wasn't changed by the stated timeout, proving an active file manager. Useful?

Unrelated, I noticed that to be "net-protocol" consistent, the float numbers need to be in "POSIX" format, and running the monitor in some locales (eg da_DK) would result in incorrect radix characters; so I'm fixing that...and I should map non-POSIX radix characters retrieved from the API as well...
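
As an illustration of the radix problem, here is a minimal bash sketch that forces the C locale when formatting and maps a lone comma radix to a dot. The function name, the two-decimal output and the "single comma, no dot" rule are assumptions for the example, not the fix that went into the script.

```shell
#!/usr/bin/env bash
# to_posix_float VALUE
# Prints VALUE with "." as the radix character, or fails if VALUE is not
# a number once a lone comma has been mapped to a dot.
to_posix_float() {
  local v=$1 out
  # Treat "230,5" as 230.5; leave values such as "1,200.20" alone so
  # that they are rejected below rather than silently misread.
  if [[ $v == *,* && $v != *.* && $v != *,*,* ]]; then
    v=${v/,/.}
  fi
  # LC_ALL=C makes printf both parse and print with a dot, whatever the
  # locale of the calling environment (e.g. da_DK).
  out=$(LC_ALL=C printf '%.2f' "$v" 2>/dev/null) || return 1
  printf '%s\n' "$out"
}
```

For instance, `to_posix_float '230,5'` prints `230.50`, while `to_posix_float '1,200.20'` returns a non-zero status and prints nothing.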

@jimklimov
Member

Regarding "idea B", sounds interesting.

I think one way for it would be to addvar() a maxfileage option, to be terminology-consistent with a number of other tools (so staleness would be based on "now minus timestamp of the file"), with corresponding update for the docs/man/dummy-ups.txt man page that this is a feature for whoever wants to integrate external monitoring scripts as data feeds for NUT.

But given the dynamic nature of the driver and this use-case (with external data-source's change frequency maybe also a variable factor), a new keyword for DEV/SEQ file format could also be a correct way. Still, probably best done as a no-op hint for the dummy-ups parser ("starting from this moment until further notice, maxfileage is X").

In fact, both approaches can be combined, for the driver command-line argument (via addvar()) to provide a starting/default value, and scripts that have a need to do so can adjust it at run-time.

A negative value (also as built-in default for the introduced C variable), would mean ignoring the file age and never becoming "Data stale" (at least not due to its non-modification). This way existing use-cases that loop over the same sample file would never be stale.
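
The proposed semantics ("now minus timestamp of the file", with a negative value disabling the check) can be modelled in a few lines of bash. This is only a sketch of the idea, not driver code: the function name is made up, `stat -c %Y` assumes GNU coreutils, and treating a missing file as stale is an extra assumption that the discussion does not settle.

```shell
#!/usr/bin/env bash
# file_is_stale FILE MAXAGE [NOW]
# Returns 0 (stale) when FILE is older than MAXAGE seconds.
# A negative MAXAGE means "ignore the file age": never stale.
# NOW defaults to the current time; it is a parameter to ease testing.
file_is_stale() {
  local file=$1 maxage=$2 now=${3:-$(date +%s)} mtime
  (( maxage < 0 )) && return 1
  # Assumption: an unreadable or missing file counts as stale.
  mtime=$(stat -c %Y "$file" 2>/dev/null) || return 0
  (( now - mtime > maxage ))
}
```

A sample file that is looped over forever and never rewritten stays usable with the default of `-1`, while a script-fed port file would set a positive limit somewhat larger than its polling interval.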

Hm... while at it, another new keyword for DEV/SEQ file format, to inject and revert the staleness report can also be helpful for uses of dummy-ups in NUT integration testing and similar use-cases, e.g.:

DATASTALE on|yes|true|1
TIMER 60
DATASTALE off|no|false|0

@jimklimov
Member

Regarding the "idea A", I suppose the "monitor" here is your script that polls the external data source and populates the NUT driver with information? I suggest you wrap it into a systemd unit of its own, for a number of reasons explored below, including:

  • avoid simultaneous runs
  • define dependencies (needs networking, required before NUT driver as its data consumer)
  • it may also be interesting to explore adding a timer unit for regular updates, as an alternative to looping - may be good if the updates are not frequent, or bad if there's a large overhead to spawn the interpreter, open files, etc. If the timer unit is a good fit, you can also see when it last ran (or failed to - and perhaps why), but maybe it would be trickier to define it as a dependency for the NUT driver.

Then also note that systemd's "Wants/Requires" in a drop-in file for the [email protected] (or better "WantedBy/RequiredBy" referencing the driver unit in your script's service) and respective "After" (or "Before") should do the trick and cases like this are one of the reasons the nut-driver-enumerator (NDE, see Wiki for details) appeared to wrap all devices' drivers into dedicated systemd/SMF instances which can have individual dependency on whatever (or be a dependency of whatever).

  • the choice of "Wants/Requires" boils down to whether you want the NUT driver to fail/stop when the monitoring script does, or to go on reporting - perhaps that the data is stale based on file age?

I can't vouch at the moment that NDE would not remove the whole directory with drop-in setting tuning for a particular driver instance when it decides the unit no longer exists (ups.conf section renamed, complete unit regeneration is requested, scripting bugs) so suggest that referencing the driver unit in your script's service unit is better :)

Keeping that file in a tmpfs (e.g. /dev/shm or under /run - perhaps in a subdirectory writeable to the non-root monitoring script and readable by nut) would help avoid irrelevant initial contents when the system is rebooted, at least. Would not help stopping the services for a while and restarting them though - that's what the dependencies can be good for (as well as max file age, or the ExecStartPre trick - as long as you can avoid two copies of the script running at the same unfortunate moment and garbling the file, so wrapping it into a dedicated service is also good for that).
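
A dedicated service unit is the approach suggested above for keeping two copies of the script from running at once. Where the script might also be started by hand, a guard inside the script itself can complement it. The following is a minimal sketch using the atomicity of `mkdir`; the function names and the lock path in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# acquire_lock DIR: succeeds only for the one process that creates DIR.
# mkdir either creates the directory or fails, with no window in between,
# so two concurrent callers cannot both succeed.
acquire_lock() { mkdir "$1" 2>/dev/null; }

# release_lock DIR: removes the lock directory again.
release_lock() { rmdir "$1" 2>/dev/null; }

# Typical use near the top of a monitor script (path is a placeholder,
# and its parent directory must already exist):
#   lock=/run/my-monitor/lock
#   acquire_lock "$lock" || { echo "already running" >&2; exit 1; }
#   trap 'release_lock "$lock"' EXIT
```

A lock of this kind survives a crash that skips the `trap`, so it is best kept on a tmpfs such as `/run`, where a reboot clears it. A systemd service does not have that weakness, which is one more argument for the unit-based approach.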

@sshambar
Contributor Author

... I suggest you wrap it into a systemd unit of its own,...

This was how it was designed to work (see INSTALL section of the README), but I've made it a bit more explicit by checking in the service template file now :)

I've also added a special mode for ExecStartPre, and ordered its service "Before" the driver; the "Pre" script can either be run to just "set nocomms", or as a "one-shot" to update the driver values before the driver starts. The main ExecStart then just runs in "monitor" mode and keeps the driver updated (and will auto-restart on failure)

I can't vouch at the moment that NDE would not remove the whole directory with drop-in setting tuning for a particular driver instance when it decides the unit no longer exists (ups.conf section renamed, complete unit regeneration is requested, scripting bugs) so suggest that referencing the driver unit in your script's service unit is better :)

The service unit does reference the driver directly in [Install]WantedBy=, but that needs to be "enabled" to take effect. I'll have to check on what NDE does when someone removes/re-adds the ups.conf section to see if the [email protected]/ directory is removed...

Keeping that file in a tmpfs (e.g. /dev/shm or under /run - perhaps in a subdirectory writeable to the non-root monitoring script and readable by nut) would help avoid irrelevant initial contents when the system is rebooted, at least.

I considered using /run when first writing the script, but there's enough configuration in the "port file" (battery.charge.low, battery.voltage.high etc) that I didn't want to lose that state on reboot... The ExecStartPre mode (above) should take care of the "staleness" problem though...

@sshambar
Contributor Author

Regarding "idea B", sounds interesting.

... a maxfileage option ... a new keyword for DEV/SEQ file format ..., both approaches can be combined,

I like the idea of a default with a keyword override... (just MAXFILEAGE to be consistent?)

This would add a new "mode" for dummy-ups to operate in, as it currently just "exits" when the port file is removed to indicate staleness, whereas this new mode would allow it to remain running after calling the appropriate dstate_datastale() function -- which would have the further benefit of allowing the dummy driver to recover from a "nocomms" state much faster (currently it's delayed until systemd restarts the dummy-ups driver service)

Hm... while at it, another new keyword for DEV/SEQ file format,...

DATASTALE on|yes|true|1
TIMER 60
DATASTALE off|no|false|0

Yes, that's an excellent idea... but that also might suggest another "WATCH_TIMER"-like keyword to enable a "watchdog" timer that waits unless the file was modified, so

WATCH_TIMER 60
DATASTALE on|yes|true|1

would result in staleness if the file wasn't updated... (TIMER could be updated to do this, but that would change current behavior).

Either option would work, and could even co-exist...

@jimklimov
Member

jimklimov commented Feb 22, 2025

The WATCH_TIMER also makes sense, but check code first: IIRC TIMER sleeps in 1-second increments (to help timely graceful exits when asked to); maybe it already checks if the file changed - or that's a point where the new keyword might do so (and original TIMER would not).

just MAXFILEAGE to be consistent?

for a DEV/SEQ keyword, yes :)

CLI/ups.conf options are (mostly?) lower-case

the [email protected]/ directory

I'd expect it (for drop-ins) and systemd (for dependencies) to mostly manipulate individual units' directories, e.g. [email protected]/ or [email protected]/ directory names.

Likewise, dependencies declared from the monitoring script's unit would have to spell out the unit name for the driver. It defaults to quoting the section name from ups.conf, but if that string is invalid for the framework (systemd/SMF) it would use an MD5 hash of that name. I think the scripts (nut-driver-enumerator and upsdrvsvcctl) expose enough ways to check the mapping of driver sections vs. unit instances. That would help in active (re-)set-up of dependencies for your monitor unit (manually or scripted via e.g. drop-in files), but not as something systemd could call and evaluate on the fly, I suppose.

@jimklimov
Member

jimklimov commented Feb 23, 2025

One more idea comes to mind: expose the timestamp of latest data refresh. Not sure if it is a security issue to show that server info though. Maybe the time since driver start is better in this regard (some value changing or not), or making it a global/device driver option.

I'll think about a centralized solution for this idea (posted as #2810), but in the meanwhile your script could populate an experimental.timestamp or similar field in the text file that dummy-ups would publish, if that helps you somehow.
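
As a rough sketch of that interim suggestion, a script could write a refresh stamp into the port file and a consumer could derive the data age from it. The `experimental.timestamp` name comes from the comment above; the epoch-seconds format and both function names are assumptions for illustration.

```shell
#!/usr/bin/env bash
# stamp_line [EPOCH]: print a port-file line carrying the refresh time.
stamp_line() {
  printf 'experimental.timestamp: %s\n' "${1:-$(date +%s)}"
}

# data_age FILE [NOW]: print the seconds elapsed since the last stamp in
# FILE, or fail if no valid stamp is present.
data_age() {
  local file=$1 now=${2:-$(date +%s)} ts
  ts=$(sed -n 's/^experimental\.timestamp: //p' "$file" | tail -n 1)
  [[ $ts =~ ^[0-9]+$ ]] || return 1
  printf '%s\n' "$(( now - ts ))"
}
```

Unlike the file modification time, a stamp written by the script only advances when the API query actually succeeded, so it reflects data freshness and not merely script activity.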

@jimklimov
Member

Unrelated, I noticed that to be "net-protocol" consistent, the float numbers need to be in "POSIX" format, and running the monitor in some locales (eg da_DK) would result in incorrect radix characters; so I'm fixing that...and I should map non-POSIX radix characters retrieved from the API as well...

I wonder... do we have any precedents for extended POSIX numbers like 1.04e+7? Should these be valid?

@sshambar
Contributor Author

I wonder... do we have any precedents for extended POSIX numbers like 1.04e+7? Should these be valid?

net-protocol.txt defines the number format pretty clearly:

Note that float values are expressed using decimal (base 10) english-based
representation, so using a dot, in non-scientific notation.  So hexadecimal,
exponents, and comma for thousands separator are forbidden.
For example: "1200.20" is valid, while "1,200.20" and "1200,20" and "1.2e4"
are invalid.
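
The quoted rule translates into a short check that a script can run before writing a value to the port file. This is a sketch under assumptions: the function name is hypothetical, and the pattern is deliberately strict (it also rejects forms such as a leading `+` or a bare `.5`, which the quoted text does not mention either way).

```shell
#!/usr/bin/env bash
# is_nut_number VALUE
# Succeeds for plain base-10 numbers with an optional sign and an
# optional dot-separated fraction; fails for thousands separators,
# comma radix characters, exponents and hexadecimal.
is_nut_number() {
  local re='^-?[0-9]+(\.[0-9]+)?$'
  [[ $1 =~ $re ]]
}
```

Run against the examples from the specification, `is_nut_number 1200.20` succeeds, while `1,200.20`, `1200,20` and `1.2e4` are all rejected.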

@jimklimov
Member

Yeah, I first (a few days ago while commuting) read your comment as saying that the documentation was not consistent, so I went digging into the code to look up the actual formatting string patterns used :)

FWIW also updated the spec in docs/nut-names.txt regarding the numeric format in values, which is another place one can look for such info.

jimklimov added a commit to jimklimov/nut that referenced this issue Feb 23, 2025
…om "%g" to "%f" to match NUT naming spec [networkupstools#2807]

Signed-off-by: Jim Klimov <[email protected]>
@sshambar
Contributor Author

I can't vouch at the moment that NDE would not remove the whole directory with drop-in setting tuning for a particular driver instance when it decides the unit no longer exists ...

I tested this... it appears that the [Install]WantedBy= is effectively "owned" by the [email protected] instance, and not affected by any "disable/enable" on the [email protected] instances. Therefore, editing ups.conf and triggering NDE doesn't impact the dependency (which is nice :)

@sshambar
Contributor Author

Submitted a draft PR #2813 with a new directory scripts/external_apis and the enphase script/README etc in a subdirectory.

Looking for comments:

  • if you'd like a different directory name, or would prefer a standalone directory directly below scripts/
  • added a short README for the external_apis directory, and a "full" README in the enphase subdirectory
  • didn't add the "full" README to spellcheck, and it has lots of "shell-command" like things that don't check well...

I figured I'd check in the current script as is, and if we add the new "stale file" features to dummy-ups, then I'll submit those additions in a later version as an option :)

@jimklimov
Member

PR looks good, I've posted some comments and updates there, and generally hope to merge soon (as it passes CI).

Makes sense to add new features holistically as different change sets striking in all relevant areas (driver, docs, maybe config parsing bindings) as one PR.
