
Commit 08905be

improve README.rst and docs
1 parent a18cc09 commit 08905be

9 files changed, 687 additions and 466 deletions

.travis.yml

+2-2
@@ -13,7 +13,7 @@ matrix:
 - python: 3.7
   env: TOXENV=py37
 - python: 3.8
-  env: TOXENV=py38
+  env: TOXENV=py38 PYPI_RELEASE_JOB=true
 - python: 3.9-dev
   env: TOXENV=py39
 
@@ -41,4 +41,4 @@ deploy:
 on:
   tags: true
   repo: scrapinghub/dateparser
-  condition: "$TOXENV == py27"
+  condition: "$PYPI_RELEASE_JOB == true"

README.rst

+164-443
Large diffs are not rendered by default.

docs/conf.py

+15-3
@@ -29,7 +29,7 @@
 
 # Add any Sphinx extension module names here, as strings. They can be
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
-extensions = ['sphinx.ext.autodoc', 'sphinx.ext.viewcode', 'sphinx.ext.intersphinx']
+extensions = ['sphinx.ext.autodoc', 'sphinx.ext.viewcode', 'sphinx.ext.intersphinx', 'sphinx_rtd_theme']
 
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
@@ -65,7 +65,7 @@
 
 # The theme to use for HTML and HTML Help pages. See the documentation for
 # a list of builtin themes.
-html_theme = 'default'
+html_theme = 'sphinx_rtd_theme'
 
 # Add any paths that contain custom static files (such as style sheets)
 # here, relative to this directory. They are copied after the builtin
@@ -117,4 +117,16 @@
 ]
 
 # sphinx.ext.intersphinx confs
-intersphinx_mapping = {'python': ('https://docs.python.org/2', None)}
+intersphinx_mapping = {'python': ('https://docs.python.org/3', None)}
+
+
+html_theme_options = {
+    'logo_only': True,
+    'collapse_navigation': True,
+    'sticky_navigation': True,
+    'navigation_depth': 4,
+    'includehidden': True,
+    'titles_only': False
+}
+
+html_logo = "../artwork/dateparser-logo.png"

docs/contributing.rst

+1
@@ -1 +1,2 @@
+.. _contributing:
 .. include:: ../CONTRIBUTING.rst

docs/index.rst

+40-8
@@ -1,26 +1,58 @@
-.. dateparser documentation master file, created by
-   sphinx-quickstart on Tue Jul 9 22:26:36 2013.
-   You can adapt this file completely to your liking, but it should at least
-   contain the root `toctree` directive.
+====================================================
+dateparser -- python parser for human readable dates
+====================================================
+
+.. image:: https://img.shields.io/pypi/dm/dateparser
+   :target: https://pypi.python.org/pypi/dateparser
+   :alt: pypi downloads
+
+.. image:: https://img.shields.io/pypi/v/dateparser.svg
+   :target: https://pypi.python.org/pypi/dateparser
+   :alt: pypi version
+
+.. image:: https://codecov.io/gh/scrapinghub/dateparser/branch/master/graph/badge.svg
+   :target: https://codecov.io/gh/scrapinghub/dateparser
+   :alt: Code Coverage
+
+.. image:: https://img.shields.io/travis/scrapinghub/dateparser/master.svg
+   :target: https://travis-ci.org/scrapinghub/dateparser
+   :alt: travis build status
+
+.. image:: https://readthedocs.org/projects/dateparser/badge/?version=latest
+   :target: http://dateparser.readthedocs.org/en/latest/?badge=latest
+   :alt: Documentation Status
+
+
+`dateparser` provides modules to easily parse localized dates in almost
+any string formats commonly found on web pages.
 
-.. include:: ../README.rst
-.. include:: usage.rst
 
 Documentation
 =============
 
+This documentation is built automatically and can be found on
+`Read the Docs <https://dateparser.readthedocs.org/en/latest/>`_.
+
+
+.. include:: introduction.rst
+
+Indices and tables
+==================
+
+
 Contents:
 
 .. toctree::
    :maxdepth: 2
 
+   introduction
    installation
+   usage
+   supported_locales
    contributing
    authors
    history
 
-Indices and tables
-==================
 
 * :ref:`genindex`
 * :ref:`modindex`

docs/introduction.rst

+246
@@ -0,0 +1,246 @@
+==========================
+Introduction to dateparser
+==========================
+
+
+Features
+========
+
+* Generic parsing of dates in over 200 language locales plus numerous formats in a language-agnostic fashion.
+* Generic parsing of relative dates like: ``'1 min ago'``, ``'2 weeks ago'``, ``'3 months, 1 week and 1 day ago'``, ``'in 2 days'``, ``'tomorrow'``.
+* Generic parsing of dates with time zone abbreviations or UTC offsets like: ``'August 14, 2015 EST'``, ``'July 4, 2013 PST'``, ``'21 July 2013 10:15 pm +0500'``.
+* Date lookup in longer texts.
+* Support for non-Gregorian calendar systems. See `Supported Calendars`_.
+* Extensive test coverage.
+
+
+Basic Usage
+===========
+
+The most straightforward way is to use the `dateparser.parse <#dateparser.parse>`_ function,
+which wraps most of the functionality in the module.
+
+.. automodule:: dateparser
+   :members: parse
+
+
+Popular Formats
+---------------
+
+>>> import dateparser
+>>> dateparser.parse('12/12/12')
+datetime.datetime(2012, 12, 12, 0, 0)
+>>> dateparser.parse('Fri, 12 Dec 2014 10:55:50')
+datetime.datetime(2014, 12, 12, 10, 55, 50)
+>>> dateparser.parse('Martes 21 de Octubre de 2014') # Spanish (Tuesday 21 October 2014)
+datetime.datetime(2014, 10, 21, 0, 0)
+>>> dateparser.parse('Le 11 Décembre 2014 à 09:00') # French (11 December 2014 at 09:00)
+datetime.datetime(2014, 12, 11, 9, 0)
+>>> dateparser.parse('13 января 2015 г. в 13:34') # Russian (13 January 2015 at 13:34)
+datetime.datetime(2015, 1, 13, 13, 34)
+>>> dateparser.parse('1 เดือนตุลาคม 2005, 1:00 AM') # Thai (1 October 2005, 1:00 AM)
+datetime.datetime(2005, 10, 1, 1, 0)
+
+This will try to parse a date from the given string, attempting to
+detect the language each time.
+
+You can specify the language(s), if known, using the ``languages`` argument. In this case, the given languages are used and language detection is skipped:
+
+>>> dateparser.parse('2015, Ago 15, 1:08 pm', languages=['pt', 'es'])
+datetime.datetime(2015, 8, 15, 13, 8)
+
+If you know the possible formats of the dates, you can
+use the ``date_formats`` argument:
+
+>>> dateparser.parse('22 Décembre 2010', date_formats=['%d %B %Y'])
+datetime.datetime(2010, 12, 22, 0, 0)
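Note on the ``date_formats`` example: the format strings use ``strptime``-style directives. The sketch below uses only the standard library to show what ``'%d %B %Y'`` matches. It assumes an English month name and the default C locale, and it does not reproduce how dateparser handles the French ``Décembre`` internally.

```python
from datetime import datetime

# Stdlib-only illustration of the directives used above:
# %d = day of month, %B = full month name, %Y = four-digit year.
# Assumption: default C/English locale, so the month name is English.
fmt = '%d %B %Y'
parsed = datetime.strptime('22 December 2010', fmt)
print(parsed)
```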
+
+
+Relative Dates
+--------------
+
+>>> parse('1 hour ago')
+datetime.datetime(2015, 5, 31, 23, 0)
+>>> parse('Il ya 2 heures') # French (2 hours ago)
+datetime.datetime(2015, 5, 31, 22, 0)
+>>> parse('1 anno 2 mesi') # Italian (1 year 2 months)
+datetime.datetime(2014, 4, 1, 0, 0)
+>>> parse('yaklaşık 23 saat önce') # Turkish (23 hours ago)
+datetime.datetime(2015, 5, 31, 1, 0)
+>>> parse('Hace una semana') # Spanish (a week ago)
+datetime.datetime(2015, 5, 25, 0, 0)
+>>> parse('2小时前') # Chinese (2 hours ago)
+datetime.datetime(2015, 5, 31, 22, 0)
+
+.. note:: Running the code above may return different values depending on your environment's current date and time.
+
+.. note:: Support for relative dates in the future needs a lot of improvement; we look forward to community contributions in this area. See ":ref:`contributing`".
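Note on the relative-date outputs: they are only reproducible against a fixed reference instant. The stdlib sketch below is not dateparser code. It assumes a base of 2015-06-01 00:00, a value inferred from the sample outputs rather than stated in the docs, and shows the arithmetic behind a few of them.

```python
from datetime import datetime, timedelta

# Fixed reference instant (assumption: inferred from the sample outputs,
# not documented anywhere in the commit).
base = datetime(2015, 6, 1, 0, 0)

one_hour_ago = base - timedelta(hours=1)
twenty_three_hours_ago = base - timedelta(hours=23)
a_week_ago = base - timedelta(weeks=1)

print(one_hour_ago)
print(twenty_three_hours_ago)
print(a_week_ago)
```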
+
+
+OOTB Language Based Date Order Preference
+-----------------------------------------
+
+>>> # parsing ambiguous date
+>>> parse('02-03-2016') # assumes English, uses MDY date order
+datetime.datetime(2016, 2, 3, 0, 0)
+>>> parse('le 02-03-2016') # detects French, uses DMY date order
+datetime.datetime(2016, 3, 2, 0, 0)
+
+.. note:: Ordering is not locale-based, so do not expect `DMY` order for UK/Australian English. In that case, you can specify the date order using `settings`:
+
+>>> parse('18-12-15 06:00', settings={'DATE_ORDER': 'DMY'})
+datetime.datetime(2015, 12, 18, 6, 0)
+
+For more on date order, please look at Settings.
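Note on date order: the ambiguity can be reproduced with the standard library alone. In the sketch below, ``ORDER_FORMATS`` and ``parse_with_order`` are hypothetical names made up for illustration and are not part of dateparser. They only show why the same string yields two different dates.

```python
from datetime import datetime

# Hypothetical mapping (not a dateparser API): order name -> strptime
# format for dash-separated numeric dates.
ORDER_FORMATS = {
    'MDY': '%m-%d-%Y',
    'DMY': '%d-%m-%Y',
    'YMD': '%Y-%m-%d',
}

def parse_with_order(text, order):
    """Parse a numeric date string assuming the given field order."""
    return datetime.strptime(text, ORDER_FORMATS[order])

mdy = parse_with_order('02-03-2016', 'MDY')  # month first
dmy = parse_with_order('02-03-2016', 'DMY')  # day first
print(mdy.date(), dmy.date())
```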
+
+
+
+Timezone and UTC Offset
+-----------------------
+
+By default, `dateparser` returns a tzaware `datetime` if a timezone is present in the date string. Otherwise, it returns a naive `datetime` object.
+
+>>> parse('January 12, 2012 10:00 PM EST')
+datetime.datetime(2012, 1, 12, 22, 0, tzinfo=<StaticTzInfo 'EST'>)
+
+>>> parse('January 12, 2012 10:00 PM -0500')
+datetime.datetime(2012, 1, 12, 22, 0, tzinfo=<StaticTzInfo 'UTC\-05:00'>)
+
+>>> parse('2 hours ago EST')
+datetime.datetime(2017, 3, 10, 15, 55, 39, 579667, tzinfo=<StaticTzInfo 'EST'>)
+
+>>> parse('2 hours ago -0500')
+datetime.datetime(2017, 3, 10, 15, 59, 30, 193431, tzinfo=<StaticTzInfo 'UTC\-05:00'>)
+
+If the date has no timezone name/abbreviation or offset, you can specify it using the `TIMEZONE` setting.
+
+>>> parse('January 12, 2012 10:00 PM', settings={'TIMEZONE': 'US/Eastern'})
+datetime.datetime(2012, 1, 12, 22, 0)
+
+>>> parse('January 12, 2012 10:00 PM', settings={'TIMEZONE': '+0500'})
+datetime.datetime(2012, 1, 12, 22, 0)
+
+The `TIMEZONE` option may not be useful alone, as it only attaches the given timezone to
+the resulting `datetime` object. It can be useful, however, when you want conversions from and to different
+timezones, or when you simply want a tzaware date with the given timezone info attached.
+
+>>> parse('January 12, 2012 10:00 PM', settings={'TIMEZONE': 'US/Eastern', 'RETURN_AS_TIMEZONE_AWARE': True})
+datetime.datetime(2012, 1, 12, 22, 0, tzinfo=<DstTzInfo 'US/Eastern' EST-1 day, 19:00:00 STD>)
+
+
+>>> parse('10:00 am', settings={'TIMEZONE': 'EST', 'TO_TIMEZONE': 'EDT'})
+datetime.datetime(2016, 9, 25, 11, 0)
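Note on attaching versus converting: the difference between labelling a time with a timezone and converting it to another one can be sketched with the standard library. The fixed offsets below (EST as UTC-05:00, EDT as UTC-04:00) are assumptions for illustration. The snippet does not call dateparser and makes no claim about its internals.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for the abbreviations used above
# (assumption: EST = UTC-05:00, EDT = UTC-04:00).
EST = timezone(timedelta(hours=-5), 'EST')
EDT = timezone(timedelta(hours=-4), 'EDT')

naive = datetime(2016, 9, 25, 10, 0)

# Attaching: the wall-clock time stays 10:00, now labelled as EST.
attached = naive.replace(tzinfo=EST)

# Converting: the same instant expressed in EDT, one hour later on the clock.
converted = attached.astimezone(EDT)

print(attached.isoformat())
print(converted.isoformat())
```

The converted value reads 11:00, which agrees with the `TO_TIMEZONE` example above.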
+
+Some more use cases for timezone conversion:
+
+>>> parse('10:00 am EST', settings={'TO_TIMEZONE': 'EDT'}) # date string has timezone info
+datetime.datetime(2017, 3, 12, 11, 0, tzinfo=<StaticTzInfo 'EDT'>)
+
+>>> parse('now EST', settings={'TO_TIMEZONE': 'UTC'}) # relative dates
+datetime.datetime(2017, 3, 10, 23, 24, 47, 371823, tzinfo=<StaticTzInfo 'UTC'>)
+
+If no timezone is present in the date string or defined in `settings`, you can still
+return a tzaware `datetime`. This is especially useful for relative dates, when it is uncertain
+which timezone the relative base is in.
+
+>>> parse('2 minutes ago', settings={'RETURN_AS_TIMEZONE_AWARE': True})
+datetime.datetime(2017, 3, 11, 4, 25, 24, 152670, tzinfo=<DstTzInfo 'Asia/Karachi' PKT+5:00:00 STD>)
+
+If you want to compute relative dates in UTC instead of the system's default local timezone, you can use the `TIMEZONE` setting.
+
+>>> parse('4 minutes ago', settings={'TIMEZONE': 'UTC'})
+datetime.datetime(2017, 3, 10, 23, 27, 59, 647248, tzinfo=<StaticTzInfo 'UTC'>)
+
+.. note:: When a timezone is both present in the string and specified using `settings`, the string is parsed into a tzaware representation and then converted to the timezone specified in `settings`.
+
+>>> parse('10:40 pm PKT', settings={'TIMEZONE': 'UTC'})
+datetime.datetime(2017, 3, 12, 17, 40, tzinfo=<StaticTzInfo 'UTC'>)
+
+>>> parse('20 mins ago EST', settings={'TIMEZONE': 'UTC'})
+datetime.datetime(2017, 3, 12, 21, 16, 0, 885091, tzinfo=<StaticTzInfo 'UTC'>)
+
+For more on timezones, please look at Settings.
+
+
+Incomplete Dates
+----------------
+
+>>> from dateparser import parse
+>>> parse('December 2015') # default behavior
+datetime.datetime(2015, 12, 16, 0, 0)
+>>> parse('December 2015', settings={'PREFER_DAY_OF_MONTH': 'last'})
+datetime.datetime(2015, 12, 31, 0, 0)
+>>> parse('December 2015', settings={'PREFER_DAY_OF_MONTH': 'first'})
+datetime.datetime(2015, 12, 1, 0, 0)
+
+>>> parse('March')
+datetime.datetime(2015, 3, 16, 0, 0)
+>>> parse('March', settings={'PREFER_DATES_FROM': 'future'})
+datetime.datetime(2016, 3, 16, 0, 0)
+>>> # parsing with preference set for 'past'
+>>> parse('August', settings={'PREFER_DATES_FROM': 'past'})
+datetime.datetime(2015, 8, 15, 0, 0)
+
+You can also skip incomplete dates altogether by setting the `STRICT_PARSING` flag as follows:
+
+>>> parse('December 2015', settings={'STRICT_PARSING': True})
+None
+
+For more on handling incomplete dates, please look at Settings.
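Note on filling in a missing day: the ``'first'`` and ``'last'`` choices shown above can be sketched with ``calendar.monthrange``. ``fill_day`` is a hypothetical helper, not a dateparser function. The ``'current'`` label for the default case, the capping to the month length, and the assumed "today" of 2015-06-16 are all illustrative assumptions.

```python
import calendar
from datetime import date

def fill_day(year, month, prefer, today):
    """Pick a day for a month/year-only date (hypothetical helper)."""
    last = calendar.monthrange(year, month)[1]  # number of days in the month
    if prefer == 'first':
        day = 1
    elif prefer == 'last':
        day = last
    else:
        # Default case: reuse today's day of month, capped to the month
        # length (the capping is an assumption of this sketch).
        day = min(today.day, last)
    return date(year, month, day)

today = date(2015, 6, 16)  # assumed "current" date for the example
print(fill_day(2015, 12, 'current', today))
print(fill_day(2015, 12, 'last', today))
print(fill_day(2015, 12, 'first', today))
```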
+
+
+Search for Dates in Longer Chunks of Text
+-----------------------------------------
+
+You can extract dates from longer strings of text. They are returned as a list of tuples, each with the text chunk containing the date and the parsed datetime object.
+
+.. automodule:: dateparser.search
+   :members: search_dates
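Note on the return shape: a list of ``(text chunk, datetime)`` tuples can be illustrated with a toy stdlib stand-in. ``find_iso_dates`` is a hypothetical function that only recognises ``YYYY-MM-DD`` substrings. It is not how ``search_dates`` works and covers none of its language handling.

```python
import re
from datetime import datetime

# Toy pattern: only ISO-style dates such as 2015-08-14.
ISO_DATE = re.compile(r'\b\d{4}-\d{2}-\d{2}\b')

def find_iso_dates(text):
    """Return (matched substring, datetime) pairs for ISO dates in text."""
    results = []
    for match in ISO_DATE.finditer(text):
        chunk = match.group(0)
        try:
            results.append((chunk, datetime.strptime(chunk, '%Y-%m-%d')))
        except ValueError:
            # Looks like a date but is not a valid one (e.g. month 13); skip it.
            continue
    return results

found = find_iso_dates('Released on 2015-08-14, patched on 2015-10-30.')
print(found)
```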
+
+Dependencies
+============
+
+`dateparser` relies on the following libraries:
+
+* dateutil_'s module ``relativedelta`` for its freshness parser.
+* convertdate_ to convert *Jalali* dates to *Gregorian*.
+* hijri-converter_ to convert *Hijri* dates to *Gregorian*.
+* tzlocal_ to reliably get the local timezone.
+* ruamel.yaml_ (optional) for operations on language files.
+
+.. _dateutil: https://pypi.python.org/pypi/python-dateutil
+.. _convertdate: https://pypi.python.org/pypi/convertdate
+.. _hijri-converter: https://pypi.python.org/pypi/hijri-converter
+.. _tzlocal: https://pypi.python.org/pypi/tzlocal
+.. _ruamel.yaml: https://pypi.python.org/pypi/ruamel.yaml
+
+Supported languages and locales
+===============================
+You can check the supported locales by visiting the ":ref:`supported-locales`" section.
+
+
+Supported Calendars
+===================
+* Gregorian calendar.
+
+* Persian Jalali calendar. For more information, refer to `Persian Jalali Calendar <https://en.wikipedia.org/wiki/Iranian_calendars#Zoroastrian_calendar>`_.
+
+>>> from dateparser.calendars.jalali import JalaliCalendar
+>>> JalaliCalendar('جمعه سی ام اسفند ۱۳۸۷').get_date()
+{'date_obj': datetime.datetime(2009, 3, 20, 0, 0), 'period': 'day'}
+
+
+* Hijri/Islamic Calendar. For more information, refer to `Hijri Calendar <https://en.wikipedia.org/wiki/Islamic_calendar>`_.
+
+>>> from dateparser.calendars.hijri import HijriCalendar
+>>> HijriCalendar('17-01-1437 هـ 08:30 مساءً').get_date()
+{'date_obj': datetime.datetime(2015, 10, 30, 20, 30), 'period': 'day'}
+
+.. note:: `HijriCalendar` only works with Python ≥ 3.6.
+.. note:: For `Finnish` language, please specify `settings={'SKIP_TOKENS': []}` to correctly parse freshness dates.
+
+
+Install with the following command to use calendars.
+
+.. tip::
+   pip install dateparser[calendars]
