Lean (non-XML) approach to process XBRL
A lightweight, memory-efficient, and fast Python library for extracting specific information from XBRL filings and taxonomies without loading the entire DTS (Discoverable Taxonomy Set).
Funding Acknowledgment (DFG): Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Collaborative Research Center (SFB/TRR) Project-ID 403041268 – TRR 266 Accounting for Transparency.
XBRL is powerful but complex:
- A single filing includes the instance document, company linkbases, and a huge taxonomy (hundreds of XML files)
- Traditional XBRL processors load the full DTS into memory: slow and memory-intensive
- In many real-world scenarios (data extraction, analysis, reporting), you only need a small subset of the data
LeanRL takes a pragmatic, non-strict approach:
- Process one file at a time (no full DTS loading)
- Extract only what you need into simple Python structures (dict, list, pandas.DataFrame)
- Skip strict XBRL validation and complex object models; focus on speed and simplicity
- Parse presentation linkbases (build hierarchical trees, tables, roll-forwards)
- Parse calculation linkbases (extract summation rules with weights)
- Parse definition linkbases (dimensions, tables, axes); see the documentation
- Parse label linkbases (English/translated labels and documentation)
- Parse reference linkbases (links to authoritative literature)
- Parse taxonomy schema files (elements, types, from elts/, dis/, stm/)
- Convert XBRL structures to pandas DataFrames or nested dictionaries
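To make the one-file-at-a-time idea concrete, here is a minimal sketch that streams a single calculation linkbase with the standard library's `xml.etree.ElementTree.iterparse` and collects summation rules as plain dicts. It is not LeanRL's implementation: the function name `stream_calculation_rules`, the constants, and the inline sample data are all assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Standard XBRL linkbase and XLink namespaces (defined locally for this sketch).
LINK_NS = "http://www.xbrl.org/2003/linkbase"
XLINK_NS = "http://www.w3.org/1999/xlink"

LOC_TAG = f"{{{LINK_NS}}}loc"
ARC_TAG = f"{{{LINK_NS}}}calculationArc"
LINK_TAG = f"{{{LINK_NS}}}calculationLink"

# Tiny inline stand-in for a calculation linkbase file (illustrative data only).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<link:linkbase xmlns:link="http://www.xbrl.org/2003/linkbase"
               xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:calculationLink xlink:type="extended"
      xlink:role="http://example.com/role/IncomeStatement">
    <link:loc xlink:type="locator" xlink:href="demo.xsd#demo_GrossProfit"
        xlink:label="loc_GrossProfit"/>
    <link:loc xlink:type="locator" xlink:href="demo.xsd#demo_Revenues"
        xlink:label="loc_Revenues"/>
    <link:loc xlink:type="locator" xlink:href="demo.xsd#demo_CostOfRevenue"
        xlink:label="loc_CostOfRevenue"/>
    <link:calculationArc xlink:type="arc"
        xlink:arcrole="http://www.xbrl.org/2003/arcrole/summation-item"
        xlink:from="loc_GrossProfit" xlink:to="loc_Revenues" weight="1.0"/>
    <link:calculationArc xlink:type="arc"
        xlink:arcrole="http://www.xbrl.org/2003/arcrole/summation-item"
        xlink:from="loc_GrossProfit" xlink:to="loc_CostOfRevenue" weight="-1.0"/>
  </link:calculationLink>
</link:linkbase>
"""


def stream_calculation_rules(source):
    """Return a list of {'role', 'parent', 'child', 'weight'} dicts.

    `source` is a filename or a binary file object holding ONE linkbase;
    no other taxonomy file is opened.
    """
    rules = []
    locators = {}  # xlink:label -> concept name, valid within one extended link
    arcs = []      # (from_label, to_label, weight) tuples awaiting resolution
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == LOC_TAG:
            href = elem.get(f"{{{XLINK_NS}}}href", "")
            # The concept name is the fragment after '#'.
            locators[elem.get(f"{{{XLINK_NS}}}label")] = href.rpartition("#")[2]
        elif elem.tag == ARC_TAG:
            arcs.append((
                elem.get(f"{{{XLINK_NS}}}from"),
                elem.get(f"{{{XLINK_NS}}}to"),
                float(elem.get("weight", "1.0")),
            ))
        elif elem.tag == LINK_TAG:
            # Resolve arcs only once the whole extended link has been read,
            # so locators that appear after their arcs are still found.
            role = elem.get(f"{{{XLINK_NS}}}role")
            for src, dst, weight in arcs:
                rules.append({
                    "role": role,
                    "parent": locators.get(src, src),
                    "child": locators.get(dst, dst),
                    "weight": weight,
                })
            locators.clear()
            arcs.clear()
            elem.clear()  # free the subtree that was just processed
    return rules


rules = stream_calculation_rules(io.BytesIO(SAMPLE))
for rule in rules:
    print(rule["parent"], "<-", rule["child"], rule["weight"])
```

The resulting list of dicts can be passed straight to `pandas.DataFrame(rules)` if a table is preferred. Memory use stays bounded by the largest extended link rather than by the whole taxonomy.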
To install the released version, run:
pip install leanrl
To install the latest development version from this GitHub repo, run:
git clone https://github.com/reeyarn/LeanRL/
cd LeanRL
pip install -e .
or, with uv and the dev extras:
uv pip install -e ".[dev]"
from leanrl import parse_label_linkbase, Roles

# Get documentation strings, keyed by concept
path = "/tmp/us-gaap-2020-01-31/elts/"
# path = "LeanRL/tests/data/"
filename = "us-gaap-doc-2020-01-31.xml"
docs = parse_label_linkbase(path + filename)
for i, (concept, doc) in enumerate(docs.items()):
    print(f"{i}: {concept}: {doc}")
    if i > 32:  # stop after the first 34 entries
        break

# Get display labels, keyed by concept
labels = parse_label_linkbase(path + "us-gaap-lab-2020-01-31.xml", role=Roles.LABEL)
for i, (concept, label) in enumerate(labels.items()):
    print(f"{i}: {concept}: {label}")
    if i > 32:  # stop after the first 34 entries
        break

See test2.py and test3.py for the latest added features.
Output example:
| | concept | label | documentation | reference | data_type | is_abstract | period_type | is_monetary | balance | all_statements | all_disclosures | depth | path |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 12 | us-gaap_AdmissionMember | Admission [Member] | Right or permission to enter. Includes, but is not limited to, entrance to park, ride, attraction, theater, sporting event, and movie. | Reference(FASB Accounting Standards Codification Topic 606 Section 55) | nonnum:domainItemType | True | duration | False | | soi | | 5 | [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > srt_ProductOrServiceAxis > srt_ProductsAndServicesDomain > us-gaap_AdmissionMember |
| 120 | us-gaap_AccidentAndHealthInsuranceSegmentMember | Accident and Health Insurance Product Line [Member] | Product line consisting of insurance against loss by illness or injury, including but not limited to medical, dental, disability, workmen's compensation and long-term care. | Reference(FASB Accounting Standards Codification Topic 944 Section 55), Reference(FASB Accounting Standards Codification Topic 944 Section S99), Reference(FASB Accounting Standards Codification Topic 944 Section S99) | nonnum:domainItemType | True | duration | False | | soi-ins | dis-sec-reins,dis-fs-insa,dis-fs-ins | 5 | [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > srt_ProductOrServiceAxis > srt_ProductsAndServicesDomain > us-gaap_AccidentAndHealthInsuranceSegmentMember |
| 121 | us-gaap_AccidentAndHealthInsuranceExcludingWorkersCompensationMember | Accident and Health Insurance, Excluding Workers' Compensation [Member] | Contract providing insurance coverage against loss by illness or injury. Includes, but is not limited to, medical, dental, disability, and long-term care. Excludes workers' compensation. | Reference(FASB Accounting Standards Codification Topic 944 Section 55) | nonnum:domainItemType | True | duration | False | | soi-ins | dis-fs-insa,dis-fs-ins | 6 | [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > srt_ProductOrServiceAxis > srt_ProductsAndServicesDomain > us-gaap_AccidentAndHealthInsuranceSegmentMember > us-gaap_AccidentAndHealthInsuranceExcludingWorkersCompensationMember |
| 347 | us-gaap_AccretionExpenseIncludingAssetRetirementObligations | Accretion Expense, Including Asset Retirement Obligations | Amount of accretion expense, which includes, but is not limited to, accretion expense from asset retirement obligations, environmental remediation obligations, and other contingencies. | | xbrli:monetaryItemType | False | duration | True | debit | soi-sbi,soi-egm,soi-re,scf-sbo,scf-indir,scf-dbo,soi-int,scf-re,soi,soi-reit | | 13 | [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > us-gaap_StatementLineItems > us-gaap_PartnershipIncomeAbstract > us-gaap_IncomeLossAttributableToParentAbstract > us-gaap_IncomeLossIncludingPortionAttributableToNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesMinorityInterestAndIncomeLossFromEquityMethodInvestmentsAbstract > us-gaap_OperatingIncomeLossAbstract > us-gaap_OperatingExpensesAbstract > us-gaap_OperatingCostsAndExpensesAbstract > us-gaap_AccretionExpenseIncludingAssetRetirementObligationsAbstract > us-gaap_AccretionExpenseIncludingAssetRetirementObligations |
| 348 | us-gaap_AccretionExpenseIncludingAssetRetirementObligationsAbstract | Accretion Expense, Including Asset Retirement Obligations [Abstract] | | | xbrli:stringItemType | True | duration | False | | soi-sbi,soi-egm,soi-re,scf-sbo,scf-indir,scf-dbo,soi-int,scf-re,soi,soi-reit | | 12 | [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > us-gaap_StatementLineItems > us-gaap_PartnershipIncomeAbstract > us-gaap_IncomeLossAttributableToParentAbstract > us-gaap_IncomeLossIncludingPortionAttributableToNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesMinorityInterestAndIncomeLossFromEquityMethodInvestmentsAbstract > us-gaap_OperatingIncomeLossAbstract > us-gaap_OperatingExpensesAbstract > us-gaap_OperatingCostsAndExpensesAbstract > us-gaap_AccretionExpenseIncludingAssetRetirementObligationsAbstract |
| 350 | us-gaap_AccretionExpense | Accretion Expense | Amount recognized for the passage of time, typically for liabilities, that have been discounted to their net present values. Excludes accretion associated with asset retirement obligations. | Reference(FASB Accounting Standards Codification Topic 410 Section 45), Reference(FASB Accounting Standards Codification Topic 420 Section 35) | xbrli:monetaryItemType | False | duration | True | debit | soi-sbi,soi-egm,soi-re,scf-sbo,scf-indir,scf-dbo,soi-int,scf-re,soi,soi-reit | | 13 | [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > us-gaap_StatementLineItems > us-gaap_PartnershipIncomeAbstract > us-gaap_IncomeLossAttributableToParentAbstract > us-gaap_IncomeLossIncludingPortionAttributableToNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesMinorityInterestAndIncomeLossFromEquityMethodInvestmentsAbstract > us-gaap_OperatingIncomeLossAbstract > us-gaap_OperatingExpensesAbstract > us-gaap_OperatingCostsAndExpensesAbstract > us-gaap_AccretionExpenseIncludingAssetRetirementObligationsAbstract > us-gaap_AccretionExpense |
| 603 | us-gaap_AdministrativeServiceMember | Administrative Service [Member] | Administrative assistance, including, but not limited to, accounting, tax, legal, regulatory filing, and share registration of managed fund and investment account of independent third party, and related and affiliated entity. | Reference(FASB Accounting Standards Codification Topic 606 Section 55) | nonnum:domainItemType | True | duration | False | | soi | | 8 | [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > srt_ProductOrServiceAxis > srt_ProductsAndServicesDomain > us-gaap_ServiceMember > us-gaap_FinancialServiceMember > us-gaap_InvestmentAdvisoryManagementAndAdministrativeServiceMember > us-gaap_AdministrativeServiceMember |
| 624 | us-gaap_AdvertisingExpense | Advertising Expense | Amount charged to advertising expense for the period, which are expenses incurred with the objective of increasing revenue for a specified brand, product or product line. | Reference(FASB Accounting Standards Codification Topic 720 Section 50) | xbrli:monetaryItemType | False | duration | True | debit | soi-ins,soi-sbi,soi-egm,soi-re,soi-int,soi,soi-reit | | 14 | [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > us-gaap_StatementLineItems > us-gaap_PartnershipIncomeAbstract > us-gaap_IncomeLossAttributableToParentAbstract > us-gaap_IncomeLossIncludingPortionAttributableToNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesMinorityInterestAndIncomeLossFromEquityMethodInvestmentsAbstract > us-gaap_OperatingIncomeLossAbstract > us-gaap_OperatingExpensesAbstract > us-gaap_SellingGeneralAndAdministrativeExpenseAbstract > us-gaap_SellingAndMarketingExpenseAbstract > us-gaap_MarketingAndAdvertisingExpenseAbstract > us-gaap_AdvertisingExpense |
| 632 | us-gaap_AdvertisingMember | Advertising [Member] | Announcement promoting product, service, or event. | Reference(FASB Accounting Standards Codification Topic 606 Section 55) | nonnum:domainItemType | True | duration | False | | soi | | 5 | [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > srt_ProductOrServiceAxis > srt_ProductsAndServicesDomain > us-gaap_AdvertisingMember |
| 636 | us-gaap_AffiliateCosts | Affiliate Costs | Costs associated with revenues arising from an entity that is an affiliate of the reporting entity by means of direct or indirect ownership. | Reference(FASB Accounting Standards Codification Topic 220 Section S99), Reference(FASB Accounting Standards Codification Topic 235 Section S99) | xbrli:monetaryItemType | False | duration | True | debit | soi-egm,soi-re,soi,soi-reit | | 14 | [Statement of Income] > us-gaap_IncomeStatementAbstract > us-gaap_StatementTable > us-gaap_StatementLineItems > us-gaap_PartnershipIncomeAbstract > us-gaap_IncomeLossAttributableToParentAbstract > us-gaap_IncomeLossIncludingPortionAttributableToNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterestAbstract > us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesMinorityInterestAndIncomeLossFromEquityMethodInvestmentsAbstract > us-gaap_OperatingIncomeLossAbstract > us-gaap_GrossProfitAbstract > us-gaap_CostOfRevenueAbstract > us-gaap_CostOfGoodsAndServicesSoldAbstract > us-gaap_CostOfGoodsAndServiceExcludingDepreciationDepletionAndAmortizationAbstract > us-gaap_AffiliateCosts |
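Rows like those in the table above pack several values into single cells: `all_statements` and `all_disclosures` hold comma-separated codes, and `path` holds the ancestor chain joined by ` > `. The sketch below shows one way to unpack them using plain dicts. The helper names are hypothetical, and it assumes empty cells arrive as empty strings, which may differ from what the library actually returns.

```python
# Two rows copied from the sample output, restricted to the columns used here.
rows = [
    {
        "concept": "us-gaap_AdmissionMember",
        "all_statements": "soi",
        "all_disclosures": "",
        "depth": 5,
        "path": "[Statement of Income] > us-gaap_IncomeStatementAbstract > "
                "us-gaap_StatementTable > srt_ProductOrServiceAxis > "
                "srt_ProductsAndServicesDomain > us-gaap_AdmissionMember",
    },
    {
        "concept": "us-gaap_AccidentAndHealthInsuranceSegmentMember",
        "all_statements": "soi-ins",
        "all_disclosures": "dis-sec-reins,dis-fs-insa,dis-fs-ins",
        "depth": 5,
        "path": "[Statement of Income] > us-gaap_IncomeStatementAbstract > "
                "us-gaap_StatementTable > srt_ProductOrServiceAxis > "
                "srt_ProductsAndServicesDomain > "
                "us-gaap_AccidentAndHealthInsuranceSegmentMember",
    },
]


def split_codes(cell):
    """Split a comma-separated cell into a list of codes; '' gives []."""
    return [code for code in cell.split(",") if code]


def path_nodes(path):
    """Split a ' > '-joined path into its nodes, root first."""
    return path.split(" > ")


def in_statement(row, code):
    """True if `code` is one of the row's statement codes (exact match)."""
    return code in split_codes(row["all_statements"])


# Exact matching matters: a substring test would also match 'soi-ins'.
soi_concepts = [row["concept"] for row in rows if in_statement(row, "soi")]
print(soi_concepts)

# In these sample rows, depth equals the number of nodes below the root.
for row in rows:
    print(row["concept"], len(path_nodes(row["path"])) - 1, row["depth"])
```

Splitting before matching avoids the false positive a substring test would produce, since `"soi" in "soi-ins"` is true for strings.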
leanrl/
├── src/leanrl/
│ ├── core/
│ │ ├── namespaces.py # qname(), Roles, NS_LINK, etc.
│ │ ├── parser.py
│ │ └── streaming.py # stream_xml()
│ ├── utils/
│ │ └── href.py # extract_concept_from_href()
│ ├── linkbases/
│ │ ├── __init__.py
│ │ ├── label.py # Label linkbase only
│ │ ├── reference.py # Reference linkbase only
│ │ ├── calculation.py # Calculation linkbase only
│ │ ├── hierarchy.py # Shared ConceptNode, ConceptTree (used by def & pre)
│ │ ├── definition.py # Definition linkbase only (imports from hierarchy)
│ │ ├── presentation.py # Presentation linkbase only (imports from hierarchy)
│ │ └── README.md # Linkbase documentation
│ ├── taxonomy/
│ │ ├── __init__.py
│ │ └── schema.py # Taxonomy schema parser
│ └── instance/ # (reserved for future instance document parsing)
├── tests/
│ ├── test1.py
│ ├── test2.py
│ └── data/ # Sample taxonomy files
└── docs/
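The comments in the layout above mention helpers for qualified names and for pulling a concept out of an `xlink:href`. Their actual signatures are not shown here, so the following is only a sketch of the two underlying ideas using the standard library. The names `clark` and `concept_from_href` are hypothetical stand-ins, not the functions in `namespaces.py` or `href.py`.

```python
from urllib.parse import urldefrag

# Standard XBRL linkbase namespace (defined locally for this sketch).
LINK_NS = "http://www.xbrl.org/2003/linkbase"


def clark(namespace, local_name):
    """Build the '{namespace}local' tag form that ElementTree uses for element names."""
    return f"{{{namespace}}}{local_name}"


def concept_from_href(href):
    """Return the fragment of a locator href, which names the concept."""
    return urldefrag(href).fragment


# Illustrative href; locators point at '<schema file>#<concept id>'.
href = "us-gaap-2020-01-31.xsd#us-gaap_AdmissionMember"
print(clark(LINK_NS, "labelArc"))
print(concept_from_href(href))
```

Precomputing tag strings with a helper like `clark` keeps comparisons against `elem.tag` cheap and readable when streaming large files.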
This project supports the European Single Electronic Format (ESEF), established by the European Securities and Markets Authority (ESMA) as the mandated digital reporting standard for annual financial reports of listed companies in the European Union. The ESEF specifications and guidelines are sourced from ESMA’s official publications and are adhered to in this implementation. For more information, visit esma.europa.eu.
This project supports the processing of filings based on the International Financial Reporting Standards (IFRS) and the European Single Electronic Format (ESEF).
IFRS Taxonomy: The IFRS Taxonomy is developed and maintained by the IFRS Foundation. The taxonomy files included or referenced in this project are sourced from the IFRS Foundation’s official repository.
- Copyright: The IFRS Taxonomy is Copyright © IFRS Foundation. All rights reserved.
- Disclaimer: This project is an open-source tool and is not affiliated with, endorsed by, or commercially licensed by the IFRS Foundation. The files are used solely to facilitate the technical validation and creation of XBRL/iXBRL documents. For official standards, please visit ifrs.org.
ESEF Guidelines: The ESEF reporting standard is established by the European Securities and Markets Authority (ESMA) for listed companies in the European Union.
- Source: ESEF specifications are sourced from ESMA’s official publications.
- Attribution: Adherence to ESEF guidelines in this project is based on public technical standards available at esma.europa.eu.
This project includes copies of the US GAAP Financial Reporting Taxonomy (e.g., us-gaap-YYYY-MM-DD.xsd), sourced from official locations (e.g., fasb.org and xbrl.us). These files are Copyright © Financial Accounting Foundation (FAF) and, for certain prior versions, XBRL US, Inc.
The taxonomy files are redistributed within this project as a "Permitted Work" pursuant to the FAF's Copyright Notice and policies. They are provided for public use to assist in the implementation and processing of XBRL data.
Compliance Conditions:
- Non-Modification: All original copyright notices, XML comments, disclaimers, and license statements embedded in the taxonomy files have been preserved unchanged.
- No Ownership Claim: This project does not claim ownership of the taxonomy; rights remain exclusively with the FAF and XBRL US.
- Authorized Use: Use of these files is subject to the Notice of Authorized Uses maintained by the FAF.
For full license terms, please see the Official Terms and Conditions.
The use of the standards, taxonomies, and schemas listed above is intended to support educational and research purposes in alignment with the open-source goals of this project.
Rights Infringement Contact: If any use herein is found to infringe upon the rights of the FASB, XBRL US, ESMA, or the IFRS Foundation, please contact the author immediately:
Contact: [email protected]
Upon receipt of a valid notice, the author will promptly remove or adjust the offending content to address any concerns.