Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the master branch of pandas.
Reproducible Example
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# sample data
dates1 = ['2021-08-26', '2021-08-27', '2021-08-30', '2021-08-31',
'2021-09-01', '2021-09-02', '2021-09-03', '2021-09-07',
'2021-09-08', '2021-09-09', '2021-09-10', '2021-09-13',
'2021-09-14', '2021-09-15', '2021-09-16', '2021-09-17',
'2021-09-20', '2021-09-21', '2021-09-22', '2021-09-23',
'2021-09-24', '2021-09-27', '2021-09-28', '2021-09-29',
'2021-09-30', '2021-10-01', '2021-10-04', '2021-10-05',
'2021-10-06', '2021-10-07', '2021-10-08']
dates2 = ['2021-08-29', '2021-09-05', '2021-09-12', '2021-09-19', '2021-09-26']
np.random.seed(365)
y1 = np.random.randn(len(dates1)).cumsum()
y2 = np.random.randn(len(dates2)).cumsum()
# dataframe with more than a month span
df1 = pd.DataFrame({'date':pd.to_datetime(dates1), 'y1':y1})
df1.set_index('date', inplace=True)
# dataframe with less than a month span
df2 = pd.DataFrame({'date':pd.to_datetime(dates2), 'y2':y2})
df2.set_index('date', inplace=True)
Issue Description
- See SO: Plotting two pandas time-series on the same axes with matplotlib - unexpected behavior
- Using pandas.DataFrame.plot to plot data whose dates span more than one month, together with a second data set whose dates span less than one month on secondary_y, produces unexpected results in how the API formats and places the xticks. This results in an incorrect visualization in subplot 0.
- In both cases, it is not clear what format the dates have been converted to for plotting.
- If dates2 spans at least a month, the issue doesn't occur (e.g. dates2 = ['2021-08-29', '2021-09-05', '2021-09-12', '2021-09-19', '2021-09-26', '2021-09-29']).
fig, axs = plt.subplots(2, 2, figsize=[12, 12])
axs = axs.flat
print('Note the difference in xticks depending on the date span')
df1.plot(ax=axs[0], title='x-axis is incorrect when the dataframe with\nmore than a month of dates is plotted first')
print(f'axs[0]: {axs[0].get_xticks()}')
df2.plot(ax=axs[0], secondary_y=True)
print(f'axs[0]: {axs[0].get_xticks()}')
df2.plot(ax=axs[1], color='tab:orange', title='x-axis is correct when the dataframe with\nless than a month of dates is plotted first')
print(f'axs[1]: {axs[1].get_xticks()}')
df1.plot(ax=axs[1], color='tab:blue', secondary_y=True)
print(f'axs[1]: {axs[1].get_xticks()}')
df1.y1.plot(ax=axs[2], color='tab:blue', title='More than a month of data')
print(f'axs[2]: {axs[2].get_xticks()}')
df2.y2.plot(ax=axs[3], color='tab:orange', title='Less than a month of data')
print(f'axs[3]: {axs[3].get_xticks()}')
plt.tight_layout()
- Printed output
Note the difference in xticks depending on the date span
axs[0]: [18871. 18878. 18885. 18892. 18901. 18908.]
axs[0]: [ 2696 4175 6784 9393 12002 14611 17220 18908]
axs[1]: [2696 2697 2700]
axs[1]: [2696 2697 2701 2702]
axs[2]: [18871. 18878. 18885. 18892. 18901. 18908.]
axs[3]: [2696 2697 2700]
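A possible reading of the two tick scales above, offered as an assumption and not a confirmed diagnosis: values near 18871 look like days since 1970-01-01, while values near 2696 look like weekly period ordinals. dates2 is evenly spaced (every Sunday), so a frequency can be inferred for it, whereas dates1 skips 2021-09-06 and has no inferable frequency. The sketch below checks this with public pandas calls only; it does not inspect the plotting internals, and idx1 is rebuilt from a business-day range purely for brevity.

```python
import pandas as pd

# dates1 from the report equals the business days in this range minus 2021-09-06
idx1 = pd.bdate_range('2021-08-26', '2021-10-08').drop(pd.Timestamp('2021-09-06'))
idx2 = pd.to_datetime(['2021-08-29', '2021-09-05', '2021-09-12',
                       '2021-09-19', '2021-09-26'])

# Frequency that pandas can infer from each index (None if irregular)
freq1 = pd.infer_freq(idx1)
freq2 = pd.infer_freq(idx2)
print(f'inferred freq for dates1: {freq1}')
print(f'inferred freq for dates2: {freq2}')

# Two candidate numeric representations of dates2
day_numbers = list((idx2 - pd.Timestamp('1970-01-01')).days)   # days since epoch
week_ordinals = [p.ordinal for p in idx2.to_period(freq2)]     # period ordinals
print(f'days since 1970-01-01: {day_numbers}')
print(f'weekly period ordinals: {week_ordinals}')
```

If the weekly ordinals land in the same range as the small tick values printed above, that would suggest the two frames are being drawn in different x units on the shared axes.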
Expected Behavior
- Plotting directly with matplotlib.pyplot.plot produces the correct result
fig, axs = plt.subplots(2, 2, figsize=[20, 8], sharey=False, sharex=False)
axs = axs.flatten()
axs[0].plot(df1.index, df1.y1, marker='.', color='tab:blue')
print(f'axs[0]: {axs[0].get_xticks()}')
ax4 = axs[0].twinx()
ax4.plot(df2.index, df2.y2, marker='.', color='tab:orange')
print(f'ax4: {ax4.get_xticks()}')
axs[1].plot(df2.index, df2.y2, marker='.', color='tab:orange')
print(f'axs[1]: {axs[1].get_xticks()}')
ax5 = axs[1].twinx()
ax5.plot(df1.index, df1.y1, marker='.', color='tab:blue')
print(f'ax5: {ax5.get_xticks()}')
axs[2].plot(df1.index, df1.y1, marker='.', color='tab:blue')
print(f'axs[2]: {axs[2].get_xticks()}')
axs[3].plot(df2.index, df2.y2, marker='.', color='tab:orange')
print(f'axs[3]: {axs[3].get_xticks()}')
- Printed output
axs[0]: [18871. 18878. 18885. 18892. 18901. 18908.]
ax4: [18871. 18878. 18885. 18892. 18901. 18908.]
axs[1]: [18868. 18871. 18875. 18879. 18883. 18887. 18891. 18895.]
ax5: [18871. 18878. 18885. 18892. 18901. 18908.]
axs[2]: [18871. 18878. 18885. 18892. 18901. 18908.]
axs[3]: [18868. 18871. 18875. 18879. 18883. 18887. 18891. 18895.]
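As a possible workaround while the behavior is investigated, pandas documents an x_compat option for plot that suppresses its automatic tick resolution adjustment for time series. The sketch below is untested against the exact versions listed in this report and assumes that passing x_compat=True to both calls keeps the shared x-axis in matplotlib date units. The random values differ from the reproducible example (a different generator is used), and the Agg backend is chosen only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend, an assumption for running as a script
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same date ranges as the report: dates1 minus the 2021-09-06 gap, weekly dates2
idx1 = pd.bdate_range('2021-08-26', '2021-10-08').drop(pd.Timestamp('2021-09-06'))
idx2 = pd.to_datetime(['2021-08-29', '2021-09-05', '2021-09-12',
                       '2021-09-19', '2021-09-26'])

rng = np.random.default_rng(365)
df1 = pd.DataFrame({'y1': rng.standard_normal(len(idx1)).cumsum()}, index=idx1)
df2 = pd.DataFrame({'y2': rng.standard_normal(len(idx2)).cumsum()}, index=idx2)

fig, ax = plt.subplots(figsize=(8, 4))
# x_compat=True asks pandas to skip its time-series specific x-axis handling
df1.plot(ax=ax, x_compat=True)
df2.plot(ax=ax, secondary_y=True, x_compat=True)

# Compare the axis limits with the data range expressed in matplotlib date units
xmin, xmax = ax.get_xlim()
first, last = mdates.date2num([idx1[0].to_pydatetime(), idx1[-1].to_pydatetime()])
print(f'xlim: {xmin}, {xmax}')
print(f'data range in date units: {first}, {last}')
```

If the x limits bracket the data range in date units, both series are being drawn on the same scale, which is the behavior the direct matplotlib calls show.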
Installed Versions
INSTALLED VERSIONS
commit : 73c6825
python : 3.8.11.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19043
machine : AMD64
processor : Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 1.3.3
numpy : 1.20.3
pytz : 2021.3
dateutil : 2.8.2
pip : 21.0.1
setuptools : 58.0.4
Cython : 0.29.24
pytest : 6.2.4
hypothesis : None
sphinx : 4.2.0
blosc : None
feather : None
xlsxwriter : 3.0.1
lxml.etree : 4.6.3
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.3
IPython : 7.27.0
pandas_datareader: 0.10.0
bs4 : 4.10.0
bottleneck : 1.3.2
fsspec : 2021.08.1
fastparquet : None
gcsfs : None
matplotlib : 3.4.3
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.7.1
sqlalchemy : 1.4.22
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 2.0.1
xlwt : 1.3.0
numba : 0.53.1