Wrong parsing of numeric values as part of units #383
Comments
Sure, in
:) Basically, if you use it, use it in the same way as |
Thanks. I read the help for |
Wow, that's some unit! :) Question: do you mean I'm not sure we meant the dot to imply anything. Probably it's just that parsing units is a complex problem, and we may simply not have considered a case like this. We'll need to check the parser and adapt it to support this. |
In fact, when there is an actual unit after the number, the problem is a different one:
# this is wrong: 1.73 appears twice in the denominator
unclass(as_units("ml/min/1.73m^2"))
#> [1] 1
#> attr(,"units")
#> $numerator
#> [1] "ml"
#>
#> $denominator
#> [1] "1.73m" "1.73m" "min"
#>
#> attr(,"class")
#> [1] "symbolic_units" |
as_units()
treat the dot in a character string?
Some more digging:
library(units)
#> udunits database from /usr/share/udunits/udunits2.xml
# supported by udunits2, but I think this is NOT what we want
units:::R_ut_format(units:::R_ut_parse("ml/min/1.73m^2"))
#> [1] "9.63391136801542e-09 m⁵·s⁻¹"
# supported too
units:::R_ut_format(units:::R_ut_parse("ml/min/1.73/m^2"))
#> [1] "9.63391136801542e-09 m·s⁻¹"
# avoid our parsing
x <- structure(
1, units = structure(
list(numerator = "ml/min/1.73/m^2", denominator = NULL),
class="symbolic_units"),
class="units")
# conversion works as expected
set_units(x, "m/s")
#> 9.633911e-09 [m/s]
# fine... in a way
unclass(as_units("ml/min/1.73/m^2"))
#> [1] 0.5780347
#> attr(,"units")
#> $numerator
#> [1] "ml"
#>
#> $denominator
#> [1] "m" "m" "min"
#>
#> attr(,"class")
#> [1] "symbolic_units"
# but we ignore the number
set_units(1, "ml/min/1.73/m^2")
#> Warning in `units<-.numeric`(`*tmp*`, value = as_units(value, ...)): numeric
#> value 0.578034682080925 is ignored in unit assignment
#> 1 [ml/m^2/min] |
@Enchufa2 To your question above, I think I mean |
I suspected that much, thanks for confirming. We'll look into this to support this use case. |
I think this is also related to the issue raised here: a negative exponent in scientific notation is not parsed and throws an error:
library(units)
#> udunits database from C:/ProgramData/R/win-library/4.3/units/share/udunits/udunits2.xml
set_units(1, "1e1g")
#> 1 [1e1g]
set_units(1, "1e+1g")
#> 1 [1e+1g]
set_units(1, "0.1g")
#> 1 [0.1g]
set_units(1, "1e-1g")
#> Error: cannot convert g into 1e
#> Did you try to supply a value in a context where a bare expression was expected?
Created on 2025-03-05 with reprex v2.0.2 |
The fact that it gets through with a positive exponent is just an artifact. Scientific notation is neither supported nor on the roadmap. |
OK, good to know. I will create a workaround for my own purposes. |
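A minimal sketch of one possible workaround, in base R only: split a leading numeric coefficient (including scientific notation) off the unit string before it reaches the units parser. The helper name `split_leading_factor` and the regular expression are illustrative assumptions, not part of the units package, and the sketch only handles a coefficient at the very start of the string.

```r
# Hypothetical helper (not part of the units package): separate a leading
# numeric coefficient, e.g. "1e-1" in "1e-1g", from the rest of the string.
split_leading_factor <- function(x) {
  # Leading number: digits, optional decimal part, optional exponent.
  # The exponent needs digits after "e", so "1eV" is read as 1 and "eV".
  pattern <- "^\\s*([0-9]*\\.?[0-9]+(?:[eE][+-]?[0-9]+)?)\\s*(.*)$"
  m <- regmatches(x, regexec(pattern, x, perl = TRUE))[[1]]
  if (length(m) == 0) {
    # No leading number: treat the whole string as the unit.
    return(list(factor = 1, unit = trimws(x)))
  }
  list(factor = as.numeric(m[2]), unit = m[3])
}

split_leading_factor("1e-1g")   # factor 0.1, unit "g"
split_leading_factor("ml/min")  # factor 1, unit "ml/min"
```

The numeric value can then be multiplied by `factor` and only `unit` passed on to `set_units()`, so the parser never sees the scientific notation. Coefficients embedded in the middle of a unit string, as in "ml/min/1.73m^2", are not handled by this sketch.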
What's your use case, BTW? |
It's a package that creates a database from publicly available data (see link below). It includes tables with units for numerical data. However, these text fields are not standardised and contain a lot of annotations and inconsistencies. So I'm writing a function that sanitises these text fields before parsing them with the units package. This is the code so far, but it is a work in progress: https://github.com/pepijn-devries/ECOTOXr/blob/units/R/process_unit.r The intention is to create a column with |
But what does the data look like? I'm asking because, if you are trying to parse data with units, |
Thanks for pointing this out. I think these parsers assume consistently formatted input, whereas my input is a lot messier and not always consistent. That's why I need to do some tidying before parsing. Below is a random sample of 50 units in the database:
|
I see... Yes, quantities' parsers have some flexibility, but otherwise assume consistency, so they cannot help here. |
It seems like strings representing integers cannot be coerced to units (except for 1) and "dot" means exponentiation (except after zero). Can you point me to the documentation? Thanks in advance.