LeanRL

Lean (non-XML) approach to process XBRL

Python 3.9+ License: MIT

A lightweight, memory-efficient, and fast Python library for extracting specific information from XBRL filings and taxonomies, without loading the entire DTS (Discoverable Taxonomy Set).

Funding Acknowledgment (DFG): Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Collaborative Research Center (SFB/TRR) Project-ID 403041268 – TRR 266 Accounting for Transparency.

Motivation

XBRL is powerful but complex:

  • A single filing includes the instance document, company linkbases, and a huge taxonomy (hundreds of XML files)
  • Traditional XBRL processors load the full DTS into memory: slow and memory-intensive
  • In many real-world scenarios (data extraction, analysis, reporting), you only need a small subset of the data

LeanRL takes a pragmatic, non-strict approach:

  • Process one file at a time (no full DTS loading)
  • Extract only what you need into simple Python structures (dict, list, pandas.DataFrame)
  • Forget strict XBRL validation and complex object models — focus on speed and simplicity
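The one-file-at-a-time idea above can be sketched with nothing but the Python standard library. This is an illustration of the general technique, not LeanRL's actual implementation: the function name `stream_labels` and the tiny in-memory sample linkbase are assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET

LINK = "http://www.xbrl.org/2003/linkbase"
XLINK = "http://www.w3.org/1999/xlink"

# Tiny hypothetical label linkbase, inlined so the sketch is self-contained.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<link:linkbase xmlns:link="http://www.xbrl.org/2003/linkbase"
               xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:labelLink xlink:type="extended" xlink:role="http://www.xbrl.org/2003/role/link">
    <link:loc xlink:type="locator" xlink:href="demo.xsd#demo_Assets" xlink:label="loc_Assets"/>
    <link:label xlink:type="resource" xlink:label="lab_Assets"
                xlink:role="http://www.xbrl.org/2003/role/label" xml:lang="en-US">Assets</link:label>
    <link:labelArc xlink:type="arc"
                   xlink:arcrole="http://www.xbrl.org/2003/arcrole/concept-label"
                   xlink:from="loc_Assets" xlink:to="lab_Assets"/>
  </link:labelLink>
</link:linkbase>"""


def stream_labels(source):
    """Return {concept: label text} from a single label linkbase (hypothetical helper)."""
    loc_to_concept = {}  # locator xlink:label -> concept id (the href fragment)
    res_to_text = {}     # label resource xlink:label -> label text
    arcs = []            # (from, to) pairs taken from labelArc elements
    # iterparse yields each element once it is complete, so the file is read incrementally.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == f"{{{LINK}}}loc":
            href = elem.get(f"{{{XLINK}}}href", "")
            loc_to_concept[elem.get(f"{{{XLINK}}}label")] = href.partition("#")[2]
        elif elem.tag == f"{{{LINK}}}label":
            res_to_text[elem.get(f"{{{XLINK}}}label")] = (elem.text or "").strip()
        elif elem.tag == f"{{{LINK}}}labelArc":
            arcs.append((elem.get(f"{{{XLINK}}}from"), elem.get(f"{{{XLINK}}}to")))
        elif elem.tag == f"{{{LINK}}}labelLink":
            elem.clear()  # drop the children of a finished extended link
    # Resolve arcs into plain concept -> text pairs.
    return {
        loc_to_concept[src]: res_to_text[dst]
        for src, dst in arcs
        if src in loc_to_concept and dst in res_to_text
    }


labels = stream_labels(io.BytesIO(SAMPLE))
print(labels)
```

The result is an ordinary dict, so no taxonomy object model has to be built or kept in memory.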

Features

  • Parse presentation linkbases (build hierarchical trees, tables, roll-forwards)
  • Parse calculation linkbases (extract summation rules with weights)
  • Parse definition linkbases (dimensions, tables, axes); see the documentation
  • Parse label linkbases (English/translated labels and documentation)
  • Parse reference linkbases (links to authoritative literature)
  • Parse taxonomy schema files (elements and types from elts/, dis/, and stm/)
  • Convert XBRL structures to pandas DataFrames or nested dictionaries
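As a rough illustration of what "summation rules with weights" means, the sketch below reads calculation arcs from a small, made-up calculation linkbase into a list of dicts using only the standard library. The function name `calculation_rules`, the sample XML, and the output keys are assumptions for this example and are not taken from LeanRL's API.

```python
import xml.etree.ElementTree as ET

LINK = "{http://www.xbrl.org/2003/linkbase}"
XLINK = "{http://www.w3.org/1999/xlink}"

# Hypothetical calculation linkbase: GrossProfit = Revenues - CostOfRevenue.
SAMPLE = """<link:linkbase xmlns:link="http://www.xbrl.org/2003/linkbase"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:calculationLink xlink:type="extended"
      xlink:role="http://example.com/role/IncomeStatement">
    <link:loc xlink:type="locator" xlink:href="demo.xsd#demo_GrossProfit" xlink:label="gp"/>
    <link:loc xlink:type="locator" xlink:href="demo.xsd#demo_Revenues" xlink:label="rev"/>
    <link:loc xlink:type="locator" xlink:href="demo.xsd#demo_CostOfRevenue" xlink:label="cor"/>
    <link:calculationArc xlink:type="arc"
        xlink:arcrole="http://www.xbrl.org/2003/arcrole/summation-item"
        xlink:from="gp" xlink:to="rev" weight="1" order="1"/>
    <link:calculationArc xlink:type="arc"
        xlink:arcrole="http://www.xbrl.org/2003/arcrole/summation-item"
        xlink:from="gp" xlink:to="cor" weight="-1" order="2"/>
  </link:calculationLink>
</link:linkbase>"""


def calculation_rules(xml_text):
    """Return one dict per calculation arc: role, parent, child, weight (hypothetical helper)."""
    rows = []
    root = ET.fromstring(xml_text)
    for link in root.iter(f"{LINK}calculationLink"):
        role = link.get(f"{XLINK}role")
        # Locators map a local xlink:label to the concept named in the href fragment.
        locs = {
            loc.get(f"{XLINK}label"): loc.get(f"{XLINK}href").partition("#")[2]
            for loc in link.iter(f"{LINK}loc")
        }
        for arc in link.iter(f"{LINK}calculationArc"):
            rows.append({
                "role": role,
                "parent": locs[arc.get(f"{XLINK}from")],
                "child": locs[arc.get(f"{XLINK}to")],
                "weight": float(arc.attrib["weight"]),
            })
    return rows


rules = calculation_rules(SAMPLE)
for rule in rules:
    print(rule)
```

A list of flat dicts like this can be passed straight to `pandas.DataFrame(rules)` when a tabular view is wanted.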

Install

To install the released version, run: pip install leanrl

To install the latest development version from this GitHub repo, run:

git clone https://github.com/reeyarn/LeanRL/
cd LeanRL
pip install -e .

Alternatively, with uv: uv pip install -e ".[dev]"

Example

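from leanrl import parse_label_linkbase, Roles

# Directory holding the unpacked US GAAP 2020 taxonomy elements (adjust to your setup)
path = "/tmp/us-gaap-2020-01-31/elts/"
# path = "LeanRL/tests/data/"  # alternative: sample files from the repo's tests

# Get documentation strings from the documentation linkbase
filename = "us-gaap-doc-2020-01-31.xml"
docs = parse_label_linkbase(path + filename)

# Print the first 34 entries (indices 0 to 33)
for i, (concept, doc) in enumerate(docs.items()):
    print(f"{i}: {concept}: {doc}")
    if i > 32:
        break

# Get display labels from the label linkbase
labels = parse_label_linkbase(path + "us-gaap-lab-2020-01-31.xml", role=Roles.LABEL)

# Print the first 34 entries (indices 0 to 33)
for i, (concept, label) in enumerate(labels.items()):
    print(f"{i}: {concept}: {label}")
    if i > 32:
        break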

See test2.py and test3.py for the most recently added features.

Output example:

concept label documentation reference data_type is_abstract period_type is_monetary balance all_statements all_disclosures depth path
12 us-gaap_AdmissionMember Admission [Member] Right or permission to enter. Includes, but is not limited to, entrance to park, ride, attraction, theater, sporting event, and movie. Reference(FASB Accounting Standards Codification Topic 606 Section 55) nonnum:domainItemType True duration False soi 5 [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > srt_ProductOrServiceAxis > srt_ProductsAndServicesDomain > us-gaap_AdmissionMember
120 us-gaap_AccidentAndHealthInsuranceSegmentMember Accident and Health Insurance Product Line [Member] Product line consisting of insurance against loss by illness or injury, including but not limited to medical, dental, disability, workmen's compensation and long-term care. Reference(FASB Accounting Standards Codification Topic 944 Section 55), Reference(FASB Accounting Standards Codification Topic 944 Section S99), Reference(FASB Accounting Standards Codification Topic 944 Section S99) nonnum:domainItemType True duration False soi-ins dis-sec-reins,dis-fs-insa,dis-fs-ins 5 [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > srt_ProductOrServiceAxis > srt_ProductsAndServicesDomain > us-gaap_AccidentAndHealthInsuranceSegmentMember
121 us-gaap_AccidentAndHealthInsuranceExcludingWorkersCompensationMember Accident and Health Insurance, Excluding Workers' Compensation [Member] Contract providing insurance coverage against loss by illness or injury. Includes, but is not limited to, medical, dental, disability, and long-term care. Excludes workers' compensation. Reference(FASB Accounting Standards Codification Topic 944 Section 55) nonnum:domainItemType True duration False soi-ins dis-fs-insa,dis-fs-ins 6 [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > srt_ProductOrServiceAxis > srt_ProductsAndServicesDomain > us-gaap_AccidentAndHealthInsuranceSegmentMember > us-gaap_AccidentAndHealthInsuranceExcludingWorkersCompensationMember
347 us-gaap_AccretionExpenseIncludingAssetRetirementObligations Accretion Expense, Including Asset Retirement Obligations Amount of accretion expense, which includes, but is not limited to, accretion expense from asset retirement obligations, environmental remediation obligations, and other contingencies. xbrli:monetaryItemType False duration True debit soi-sbi,soi-egm,soi-re,scf-sbo,scf-indir,scf-dbo,soi-int,scf-re,soi,soi-reit 13 [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > us-gaap_StatementLineItems > us-gaap_PartnershipIncomeAbstract > us-gaap_IncomeLossAttributableToParentAbstract > us-gaap_IncomeLossIncludingPortionAttributableToNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesMinorityInterestAndIncomeLossFromEquityMethodInvestmentsAbstract > us-gaap_OperatingIncomeLossAbstract > us-gaap_OperatingExpensesAbstract > us-gaap_OperatingCostsAndExpensesAbstract > us-gaap_AccretionExpenseIncludingAssetRetirementObligationsAbstract > us-gaap_AccretionExpenseIncludingAssetRetirementObligations
348 us-gaap_AccretionExpenseIncludingAssetRetirementObligationsAbstract Accretion Expense, Including Asset Retirement Obligations [Abstract] xbrli:stringItemType True duration False soi-sbi,soi-egm,soi-re,scf-sbo,scf-indir,scf-dbo,soi-int,scf-re,soi,soi-reit 12 [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > us-gaap_StatementLineItems > us-gaap_PartnershipIncomeAbstract > us-gaap_IncomeLossAttributableToParentAbstract > us-gaap_IncomeLossIncludingPortionAttributableToNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesMinorityInterestAndIncomeLossFromEquityMethodInvestmentsAbstract > us-gaap_OperatingIncomeLossAbstract > us-gaap_OperatingExpensesAbstract > us-gaap_OperatingCostsAndExpensesAbstract > us-gaap_AccretionExpenseIncludingAssetRetirementObligationsAbstract
350 us-gaap_AccretionExpense Accretion Expense Amount recognized for the passage of time, typically for liabilities, that have been discounted to their net present values. Excludes accretion associated with asset retirement obligations. Reference(FASB Accounting Standards Codification Topic 410 Section 45), Reference(FASB Accounting Standards Codification Topic 420 Section 35) xbrli:monetaryItemType False duration True debit soi-sbi,soi-egm,soi-re,scf-sbo,scf-indir,scf-dbo,soi-int,scf-re,soi,soi-reit 13 [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > us-gaap_StatementLineItems > us-gaap_PartnershipIncomeAbstract > us-gaap_IncomeLossAttributableToParentAbstract > us-gaap_IncomeLossIncludingPortionAttributableToNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesMinorityInterestAndIncomeLossFromEquityMethodInvestmentsAbstract > us-gaap_OperatingIncomeLossAbstract > us-gaap_OperatingExpensesAbstract > us-gaap_OperatingCostsAndExpensesAbstract > us-gaap_AccretionExpenseIncludingAssetRetirementObligationsAbstract > us-gaap_AccretionExpense
603 us-gaap_AdministrativeServiceMember Administrative Service [Member] Administrative assistance, including, but not limited to, accounting, tax, legal, regulatory filing, and share registration of managed fund and investment account of independent third party, and related and affiliated entity. Reference(FASB Accounting Standards Codification Topic 606 Section 55) nonnum:domainItemType True duration False soi 8 [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > srt_ProductOrServiceAxis > srt_ProductsAndServicesDomain > us-gaap_ServiceMember > us-gaap_FinancialServiceMember > us-gaap_InvestmentAdvisoryManagementAndAdministrativeServiceMember > us-gaap_AdministrativeServiceMember
624 us-gaap_AdvertisingExpense Advertising Expense Amount charged to advertising expense for the period, which are expenses incurred with the objective of increasing revenue for a specified brand, product or product line. Reference(FASB Accounting Standards Codification Topic 720 Section 50) xbrli:monetaryItemType False duration True debit soi-ins,soi-sbi,soi-egm,soi-re,soi-int,soi,soi-reit 14 [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > us-gaap_StatementLineItems > us-gaap_PartnershipIncomeAbstract > us-gaap_IncomeLossAttributableToParentAbstract > us-gaap_IncomeLossIncludingPortionAttributableToNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesMinorityInterestAndIncomeLossFromEquityMethodInvestmentsAbstract > us-gaap_OperatingIncomeLossAbstract > us-gaap_OperatingExpensesAbstract > us-gaap_SellingGeneralAndAdministrativeExpenseAbstract > us-gaap_SellingAndMarketingExpenseAbstract > us-gaap_MarketingAndAdvertisingExpenseAbstract > us-gaap_AdvertisingExpense
632 us-gaap_AdvertisingMember Advertising [Member] Announcement promoting product, service, or event. Reference(FASB Accounting Standards Codification Topic 606 Section 55) nonnum:domainItemType True duration False soi 5 [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > srt_ProductOrServiceAxis > srt_ProductsAndServicesDomain > us-gaap_AdvertisingMember
636 us-gaap_AffiliateCosts Affiliate Costs Costs associated with revenues arising from an entity that is an affiliate of the reporting entity by means of direct or indirect ownership. Reference(FASB Accounting Standards Codification Topic 220 Section S99), Reference(FASB Accounting Standards Codification Topic 235 Section S99) xbrli:monetaryItemType False duration True debit soi-egm,soi-re,soi,soi-reit 14 [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > us-gaap_StatementLineItems > us-gaap_PartnershipIncomeAbstract > us-gaap_IncomeLossAttributableToParentAbstract > us-gaap_IncomeLossIncludingPortionAttributableToNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesMinorityInterestAndIncomeLossFromEquityMethodInvestmentsAbstract > us-gaap_OperatingIncomeLossAbstract > us-gaap_GrossProfitAbstract > us-gaap_CostOfRevenueAbstract > us-gaap_CostOfGoodsAndServicesSoldAbstract > us-gaap_CostOfGoodsAndServiceExcludingDepreciationDepletionAndAmortizationAbstract > us-gaap_AffiliateCosts
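In the rows above, the depth value equals the number of concepts in the path, not counting the bracketed statement name. The sketch below shows one way such a path string and depth could be derived from a child-to-parent mapping. The function `path_and_depth` and the `parent_of` dict are hypothetical; this is not necessarily how LeanRL computes these columns.

```python
def path_and_depth(concept, parent_of, statement):
    """Build '[statement] > root > ... > concept' and count the concepts in the chain."""
    chain = [concept]
    # Walk upward until a concept has no recorded parent (assumes the hierarchy is a tree).
    while chain[-1] in parent_of:
        chain.append(parent_of[chain[-1]])
    chain.reverse()
    return " > ".join([f"[{statement}]"] + chain), len(chain)


# Child -> parent links for the us-gaap_AdmissionMember row shown above.
parent_of = {
    "us-gaap_AdmissionMember": "srt_ProductsAndServicesDomain",
    "srt_ProductsAndServicesDomain": "srt_ProductOrServiceAxis",
    "srt_ProductOrServiceAxis": "us-gaap_StatementTable",
    "us-gaap_StatementTable": "us-gaap_IncomeStatementAbstract",
}

path, depth = path_and_depth("us-gaap_AdmissionMember", parent_of, "Statement of Income")
print(depth)  # 5
print(path)
```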

Project Structure

leanrl/
├── src/leanrl/
│   ├── core/
│   │   ├── namespaces.py   # qname(), Roles, NS_LINK, etc.
│   │   ├── parser.py   
│   │   └── streaming.py    # stream_xml()
│   ├── utils/
│   │   └── href.py         # extract_concept_from_href()
│   ├── linkbases/
│   │   ├── __init__.py
│   │   ├── label.py              # Label linkbase only
│   │   ├── reference.py          # Reference linkbase only
│   │   ├── calculation.py        # Calculation linkbase only
│   │   ├── hierarchy.py          # Shared ConceptNode, ConceptTree (used by def & pre)
│   │   ├── definition.py         # Definition linkbase only (imports from hierarchy)
│   │   ├── presentation.py       # Presentation linkbase only (imports from hierarchy)
│   │   └── README.md             # Linkbase documentation
│   ├── taxonomy/
│   │   ├── __init__.py
│   │   └── schema.py             # Taxonomy schema parser
│   └── instance/                 # (reserved for future instance document parsing)
├── tests/
│   ├── test1.py
│   ├── test2.py
│   └── data/                     # Sample taxonomy files
└── docs/
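The comments in the tree mention helpers such as qname() and extract_concept_from_href(), whose signatures are not shown here. The sketch below illustrates the kind of work such helpers typically do, using differently named stand-ins (`clark_name`, `concept_from_href`) so as not to imply anything about the real functions.

```python
from urllib.parse import urldefrag

# Standard XBRL 2.1 linkbase namespace.
LINKBASE_NS = "http://www.xbrl.org/2003/linkbase"


def clark_name(namespace, local_name):
    """Return '{namespace}local', the tag form used by xml.etree.ElementTree."""
    return f"{{{namespace}}}{local_name}"


def concept_from_href(href):
    """Return the fragment of a locator href, which names the concept element."""
    return urldefrag(href).fragment


print(clark_name(LINKBASE_NS, "loc"))
print(concept_from_href("us-gaap-2020-01-31.xsd#us-gaap_AdvertisingExpense"))
```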

Attribution & Legal Notices

ESEF Standard Acknowledgment

This project supports the European Single Electronic Format (ESEF), established by the European Securities and Markets Authority (ESMA) as the mandated digital reporting standard for annual financial reports of listed companies in the European Union. The ESEF specifications and guidelines are sourced from ESMA’s official publications and are adhered to in this implementation. For more information, visit esma.europa.eu.

IFRS Taxonomy & ESEF Standards

This project supports the processing of filings based on the International Financial Reporting Standards (IFRS) and the European Single Electronic Format (ESEF).

IFRS Taxonomy: The IFRS Taxonomy is developed and maintained by the IFRS Foundation. The taxonomy files included or referenced in this project are sourced from the IFRS Foundation’s official repository.

  • Copyright: The IFRS Taxonomy is Copyright © IFRS Foundation. All rights reserved.
  • Disclaimer: This project is an open-source tool and is not affiliated with, endorsed by, or commercially licensed by the IFRS Foundation. The files are used solely to facilitate the technical validation and creation of XBRL/iXBRL documents. For official standards, please visit ifrs.org.

ESEF Guidelines: The ESEF reporting standard is established by the European Securities and Markets Authority (ESMA) for listed companies in the European Union.

  • Source: ESEF specifications are sourced from ESMA’s official publications.
  • Attribution: Adherence to ESEF guidelines in this project is based on public technical standards available at esma.europa.eu.

US GAAP Taxonomy Acknowledgment & License

This project includes copies of the US GAAP Financial Reporting Taxonomy (e.g., us-gaap-YYYY-MM-DD.xsd), sourced from official locations (e.g., fasb.org and xbrl.us). These files are Copyright © Financial Accounting Foundation (FAF) and, for certain prior versions, XBRL US, Inc.

The taxonomy files are redistributed within this project as a "Permitted Work" pursuant to the FAF's Copyright Notice and policies. They are provided for public use to assist in the implementation and processing of XBRL data.

Compliance Conditions:

  1. Non-Modification: All original copyright notices, XML comments, disclaimers, and license statements embedded in the taxonomy files have been preserved unchanged.
  2. No Ownership Claim: This project does not claim ownership of the taxonomy; rights remain exclusively with the FAF and XBRL US.
  3. Authorized Use: Use of these files is subject to the Notice of Authorized Uses maintained by the FAF.

For full license terms, please see the Official Terms and Conditions.

General Disclaimer & Takedown Notice

The use of the standards, taxonomies, and schemas listed above is intended to support educational and research purposes in alignment with the open-source goals of this project.

Rights Infringement Contact: If any use herein is found to infringe upon the rights of the FASB, XBRL US, ESMA, or the IFRS Foundation, please contact the author immediately:

Contact: [email protected]

Upon receipt of a valid notice, the author will promptly remove or adjust the offending content to address any concerns.
