Support for ISO 4217 currency codes #28

Open
rthaenert opened this issue Jan 7, 2020 · 15 comments

Comments

@rthaenert

First of all: Nice library, thanks for creating it.

For converting between major currencies it would be nice to have the ISO 4217 code of the parsed price (EUR, USD, AUD, ...) as this is easier for handling exchange rates.

Is there any plan to support that?

@lopuhin
Member

lopuhin commented Jan 8, 2020

This would be a great feature to have. Note that it's a bit tricky to implement, as one currency symbol can map to several ISO codes, so we'll need to use another attribute, such as the country, to determine whether $ means USD, AUD, HKD, SGD or something else.

@rthaenert
Author

Yes, there are many cases in which this mapping would result in more than one currency code.

Maybe it's a good idea to provide all matching currency codes, with the first element always being a major currency code (the most used ones, basically the ones outlined in https://en.wikipedia.org/wiki/Currency_pair)?

Like this:

  € => [ "EUR" ]
  $ => [ "USD", "AUD", "CAD", ...]
AU$ => [ "AUD" ]
[...]

To decide between the different $'s, the existing currency hint could be reused to get a precise mapping. For all cases in which it's still unclear, the list with all possible values should be good enough.

What do you think?
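
A minimal sketch of that idea, assuming a hand-written table and a hint that is itself a currency string. `SYMBOL_TO_CODES` and `candidate_codes` are hypothetical names, not part of price-parser, and the table is an illustrative subset only:

```python
# Hypothetical table: each symbol maps to its ISO 4217 candidates,
# with a major currency listed first. Not exhaustive.
SYMBOL_TO_CODES = {
    "€": ["EUR"],
    "$": ["USD", "AUD", "CAD"],
    "AU$": ["AUD"],
}


def candidate_codes(symbol, currency_hint=None):
    """Return candidate ISO codes, preferring the most specific source.

    Both the parsed symbol and the optional hint are looked up; the
    shorter (more specific) candidate list wins. Ties go to the symbol.
    """
    options = [
        SYMBOL_TO_CODES[key]
        for key in (symbol, currency_hint)
        if key in SYMBOL_TO_CODES
    ]
    if not options:
        return []
    return list(min(options, key=len))
```

A caller without context would take `candidate_codes("$")[0]`, while a caller with a hint such as `"AU$"` gets a single-element list.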

@lopuhin
Member

lopuhin commented Jan 10, 2020

That's an interesting option which I didn't consider before. It would mean that a caller with more information about the context could select the best variant, and a caller that does not care much could take all of them or just the first. So it seems that this approach can work well. 👍 It also looks quite future-proof to me.

@Gallaecio
Member

An alternative/complementary approach would be Dateparser’s, where users pass a locale to the parser, and the parser returns a value based on the specified locale.

@rpalsaxena
Contributor

@Gallaecio @lopuhin Is there any update regarding this feature? If it's in development, would love to contribute. :)

@Gallaecio
Member

I don’t think there is anyone working on it at the moment.

@Akay7

Akay7 commented Mar 1, 2020

As @Gallaecio suggested, it would be nicer if every locale could redefine currency symbols.

  • $ without locale -> USD(default)
  • $ Singapore locale -> SGD(redefined just for this locale)
  • $ Canadian locale -> CAD(redefined just for this locale)
  • $ Russian locale -> USD(used default relation)
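
The list above could be sketched as a default table plus per-locale overrides. All names here (`DEFAULT_SYMBOLS`, `LOCALE_OVERRIDES`, `resolve_code`) and the locale identifiers are assumptions for illustration, not an existing price-parser API:

```python
# Hypothetical defaults used when no locale redefines a symbol.
DEFAULT_SYMBOLS = {"$": "USD", "€": "EUR"}

# Hypothetical per-locale overrides; only locales that differ from
# the default need an entry.
LOCALE_OVERRIDES = {
    "en_SG": {"$": "SGD"},
    "en_CA": {"$": "CAD"},
}


def resolve_code(symbol, locale=None):
    """Return the ISO code for a symbol, letting the locale override it."""
    overrides = LOCALE_OVERRIDES.get(locale, {})
    # Fall back to the default relation when the locale has no override.
    return overrides.get(symbol, DEFAULT_SYMBOLS.get(symbol))
```

With this layout, a Russian locale needs no entry at all: `resolve_code("$", "ru_RU")` falls through to the default.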

If no one is working on this, I can start working on it.

@Gallaecio
Member

There’s no pull request open so far, so feel free to go ahead.

@ivanprado

FWIW List of circulating currencies: https://en.wikipedia.org/wiki/List_of_circulating_currencies and the support of currencies and locales in Babel: http://babel.pocoo.org/en/latest/api/numbers.html

@ivsanro1
Contributor

ivsanro1 commented Oct 11, 2022

There's a current implementation of this that I could add via PR.

This implementation works as follows:

  1. Given an input currency string (e.g. $, US$), it runs a fuzzy search (using python-Levenshtein) to select the best matching currency in a database.
  2. The currency codes of top matching currency(ies) are selected as "candidates" (they're in the database too). For example, for $ we'd have ['USD', 'CAD', 'AUD', ...], but for US$ we'd only have ['USD'] as candidates.
  3. We run a series of "disambiguation methods" to reduce the candidates list as much as possible. These disambiguation methods require additional external information like the plain text of the html of the webpage, the url, etc. This can greatly vary depending on the user's context.
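
Steps 1 and 2 could look roughly like the sketch below. It is not the implementation described above: it swaps python-Levenshtein for the standard library's `difflib.SequenceMatcher` to stay dependency-free, and `CURRENCY_DB` and `candidates_for` are hypothetical names with a toy database:

```python
from difflib import SequenceMatcher

# Hypothetical miniature database: currency string -> ISO 4217 candidates.
CURRENCY_DB = {
    "$": ["USD", "CAD", "AUD"],
    "US$": ["USD"],
    "C$": ["CAD"],
    "€": ["EUR"],
}


def candidates_for(raw):
    """Pick the database key most similar to `raw` and return its codes."""

    def score(key):
        # Similarity in [0, 1]; 1.0 means the strings are identical.
        return SequenceMatcher(None, raw.lower(), key.lower()).ratio()

    best_key = max(CURRENCY_DB, key=score)
    return list(CURRENCY_DB[best_key])
```

Note that this version always returns something, even for input that resembles no known currency, because it applies no similarity threshold.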

Steps 1 and 2 could be added to price-parser without requiring further input from the user, i.e. without changing the API:

>>> Price.fromstring('1200 $')
Price(amount=Decimal('1200'), currency='$', currency_codes=['USD', 'CAD', 'AUD', ...])

>>> Price.fromstring('1200 US$')
Price(amount=Decimal('1200'), currency='US$', currency_codes=['USD'])

Step 3 is a little trickier, as it would require more inputs from the user.

Some examples of how the API could be:

# `hint_text` would be intended to use mainly with plain HTML
Price.fromstring('1200 $', hint_text='<html><body>... currency="USD"...</body></html>')
Price(amount=Decimal('1200'), currency='$', currency_codes=['USD'])

Price.fromstring('1200 $', hint_url='www.example.ca')
Price(amount=Decimal('1200'), currency='$', currency_codes=['CAD'])
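
One possible shape for a URL-based disambiguation method is sketched here. It assumes the country-code top-level domain is a usable signal, which is often not the case, and `TLD_TO_CODE` and `narrow_by_url` are hypothetical names:

```python
from urllib.parse import urlparse

# Hypothetical table from country-code TLD to ISO 4217 code.
TLD_TO_CODE = {"ca": "CAD", "au": "AUD", "sg": "SGD"}


def narrow_by_url(candidates, hint_url):
    """Reduce `candidates` to one code if the URL's TLD points to it."""
    # urlparse only recognises the host when a scheme or "//" is present.
    if "//" not in hint_url:
        hint_url = "//" + hint_url
    host = urlparse(hint_url).hostname or ""
    tld = host.rsplit(".", 1)[-1]
    code = TLD_TO_CODE.get(tld)
    if code in candidates:
        return [code]
    # No usable signal: leave the candidate list unchanged.
    return list(candidates)
```

The function only ever narrows the list it is given, so a misleading URL cannot introduce a code that the symbol itself did not allow.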

However, in my opinion, this is beyond the scope of price-parser. I'd go for integrating only steps 1 and 2, and users would have their own way of selecting from the candidates list, as @lopuhin pointed out, since they'd have more context about their problem.

Additionally, I wanted to point out that price-parser sometimes does not find the currency, especially when it's not "standard". Here are some examples:

>>> Price.fromstring('1200 SFr')  # SFr is Swiss Franc. Currency code: CHF
Price(amount=Decimal('1200'), currency=None)

>>> Price.fromstring('1200 kz')  # "kz" is Angolan Kwanza. Currency code: AOA
Price(amount=Decimal('1200'), currency=None)

>>> Price.fromstring('دينار 1000')  # "دينار" is Bahraini dinar. Currency code: BHD
Price(amount=Decimal('1000'), currency=None)

>>> Price.fromstring('1000 BTC')  # "BTC" is Bitcoin. BTC is not part of ISO 4217, but is widely adopted
Price(amount=Decimal('1000'), currency=None)

So, unfortunately, the fuzzy search won't be that useful here: it's intended for cases where the currency string is less standard, to find currencies in a more robust way, but price-parser does not extract those strings in the first place. Its drawback is obvious: it can find wrong matches, especially because we don't use a similarity threshold to rule out "far matches" that should not be used.
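
For illustration, a similarity threshold could be sketched with the standard library's `difflib.get_close_matches`, which is a different matcher than python-Levenshtein. The 0.8 cutoff and the `KNOWN_SYMBOLS` list are arbitrary assumptions, not tuned values:

```python
from difflib import get_close_matches

# Hypothetical list of known currency strings.
KNOWN_SYMBOLS = ["$", "US$", "SFr", "Kz", "€"]


def match_symbol(raw, cutoff=0.8):
    """Return the closest known symbol, or None if nothing is close enough."""
    # n=1 keeps only the best match; cutoff discards "far matches".
    matches = get_close_matches(raw, KNOWN_SYMBOLS, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

Here `"SFr."` would still resolve to `"SFr"`, while an unrelated string such as `"GBP"` would be rejected instead of being forced onto the nearest entry.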

We have three options here:

  • Give up the fuzzy search feature. python-Levenshtein would then not be a dependency.
  • Keep the feature but keep price-parser as it is. In some cases this feature could still be useful, but less so than it could be, since price-parser would not find less typical currencies.
  • Keep the feature and make price-parser find less typical currencies. Maybe this could also be an additional parameter in the API, so that if it does not find a currency with the current method, it tries a heuristic search (we could even use the fuzzy search for this and kill two birds with one stone).

@lopuhin
Member

lopuhin commented Oct 11, 2022

Thank you @ivsanro1, an early comment on one point of your proposal:

However, in my opinion, this is beyond the scope of price-parser, I'd go for integrating only 1 and 2, and the user would have its own way of selecting from the candidates list

To me, disambiguation also looks useful, as price-parser is probably often used in a web data extraction context, where these hints make sense. In terms of the API, it could be the same, but the list of currencies could be smaller.

Also regarding the API, if we add the currency_codes attribute to Price, it also makes sense to add a currency_code property which would be non-empty in case this list has one element, to simplify the usage.
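
A minimal sketch of such a property, using a stand-in dataclass rather than the real `Price` class. `PriceSketch` is a hypothetical name, and `currency_codes` is the proposed attribute, not an existing one:

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Optional


@dataclass
class PriceSketch:
    """Stand-in for Price with the proposed currency_codes attribute."""

    amount: Optional[Decimal]
    currency: Optional[str]
    currency_codes: List[str] = field(default_factory=list)

    @property
    def currency_code(self) -> Optional[str]:
        # Only unambiguous results expose a single code.
        if len(self.currency_codes) == 1:
            return self.currency_codes[0]
        return None
```

Callers that only handle unambiguous prices can then check `price.currency_code` and ignore the list entirely.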

@lopuhin
Member

lopuhin commented Oct 11, 2022

@ivsanro1 regarding your last question,

Keep the feature and make price-parser find less typical currencies.

Looks best to me, but this can also be a different issue and a different PR. Even in its current state the fuzzy matching looks useful, as we can pass the currency_hint to Price.fromstring.

@kmike
Member

kmike commented Oct 18, 2022

Hey! Could you please elaborate, why is fuzzy search needed here? I wonder if it'd be better to hardcode more currency variations. Or is it problematic for some reason?

@ivsanro1
Contributor

Fuzzy search is only needed if we want to allow for non-exact matches. However, hardcoding the variations is also a perfectly valid approach and we would not have to worry about false positives (or at least as many as we could potentially have with fuzzy search).
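
The hardcoded alternative could be as small as an exact lookup after light normalization. This is only a sketch: `VARIANT_TO_CODES` and `lookup_codes` are hypothetical names, and the entries reuse examples from earlier in this thread:

```python
# Hypothetical hardcoded variations, keyed by normalized string.
VARIANT_TO_CODES = {
    "sfr": ["CHF"],  # Swiss franc
    "kz": ["AOA"],   # Angolan kwanza
    "us$": ["USD"],
}


def lookup_codes(raw):
    """Exact match after trimming whitespace, a trailing dot, and case."""
    key = raw.strip().rstrip(".").casefold()
    # Unknown strings yield no candidates, so there are no false positives.
    return list(VARIANT_TO_CODES.get(key, []))
```

The trade-off is maintenance: every spelling has to be listed, but a typo like `"SFx"` returns an empty list rather than a wrong currency.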

In any case, this is slightly unrelated to currency_code (sorry for that). I just mentioned it because it's related to the implementation I was describing.

@umrashrf

umrashrf commented Mar 3, 2024

My use case is passing price-parser's amount and currency to Stripe, which requires the three-letter ISO currency code rather than a symbol such as $. https://docs.stripe.com/currencies?presentment-currency=MX

Right now I have to use this code.

# Map the symbols returned by price-parser to three-letter ISO codes.
# TODO: Use this gist https://gist.github.com/jylopez/ba16be2ae55282d5cff07de65128de83
SYMBOL_TO_ISO = {"MX$": "MXN", "C$": "CAD", "$": "USD"}

def fix_currency(currency):
    # Unknown currencies are returned unchanged.
    return SYMBOL_TO_ISO.get(currency, currency)


9 participants