Am I using the newest version of the library?
Is there an existing issue for this?
Current Behavior
- The v2 implementation doesn't seem to recognize the header option. Whether the option is included or omitted, the result is the same: the header row is read as the first row of data.
- The v2 implementation doesn't read past the start cell provided in the dataAddress option.
Expected Behavior
- I expect the header to be recognized when the header option is set to true.
- I expect a starting cell passed to the dataAddress option to return that cell plus all rows below it and all columns to its right.
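To make the expectation concrete, here is a minimal pure-Python sketch of the semantics described above. It is not the library's implementation: `parse_start_cell` and `expected_read` are hypothetical helpers, the sheet is modeled as a plain list of rows, and it assumes that with the header option enabled the first row of the selected region supplies the column names.

```python
import re


def parse_start_cell(data_address):
    """Split an address such as "'Sheet1'!A2" into (sheet, row_index, col_index).

    Indices are zero-based. Only a single start cell is handled here;
    ranges like A2:C10 are out of scope for this sketch.
    """
    match = re.fullmatch(r"(?:'?([^'!]+)'?!)?([A-Za-z]+)(\d+)", data_address)
    if match is None:
        raise ValueError(f"unsupported address: {data_address!r}")
    sheet, col_letters, row_number = match.groups()
    # Convert column letters (A, B, ..., Z, AA, ...) to a one-based number.
    col_number = 0
    for ch in col_letters.upper():
        col_number = col_number * 26 + (ord(ch) - ord("A") + 1)
    return sheet, int(row_number) - 1, col_number - 1


def expected_read(grid, data_address="A1", header=True):
    """Return (column_names, data_rows) for the region anchored at the start cell.

    The region is the start cell plus every row below it and every column
    to its right. Assumption: with header=True, the first row of that
    region is treated as the header.
    """
    _, row0, col0 = parse_start_cell(data_address)
    region = [row[col0:] for row in grid[row0:]]
    if header and region:
        return region[0], region[1:]
    return None, region


# Hypothetical stand-in for the contents of Sheet1.
sheet1 = [
    ["id", "name"],
    ["1", "a"],
    ["2", "b"],
]

default_header, default_rows = expected_read(sheet1, "'Sheet1'!A1", header=True)
shifted_header, shifted_rows = expected_read(sheet1, "'Sheet1'!A2", header=True)
```

Under these assumptions, reading from A1 should yield the column names from row 1 with the remaining rows as data, and reading from A2 should yield everything from row 2 downward rather than the start cell alone.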
Steps To Reproduce
Read in downloaded file:
issue_example.xlsx
spark.read.format("excel").option("header", "true").load([downloaded file from above])
Attempt to start reading at second row:
spark.read.format("excel").option("header", "true").option("dataAddress", "'Sheet1'!A2").load([downloaded file from above])
Environment
- Spark version: 4.0.0
- Spark-Excel version: 4.0.0_0.31.2
- OS: Azure Databricks Ubuntu 24.04.2 LTS
- Cluster environment: DBR 17.0
Anything else?
The v1 implementation seems to be working as expected.