Commit b16c042

- Fixes to general readme.
- Improvements to git pre-commit hook.
- Split crop function into two overloaded versions, one receives a vector of nodes and the other receives an xpath string.
- The crop function base_xpath parameter can now be any expression, even relative to Coords node.
1 parent 3f5ea81 commit b16c042
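
The overload split described in the commit message means the Python wrapper's `crop` now accepts `*args` and leaves the choice of C++ signature to the wrapper layer. A minimal sketch of that kind of dispatch, assuming only that a Python `str` selects the xpath overload and a list or tuple selects the node-vector overload. `select_crop_overload` is a hypothetical helper written for illustration and is not part of pagexml; the xpath string is just an example value.

```python
def select_crop_overload(*args):
    """Name the documented crop signature that the positional arguments match.

    Hypothetical helper: the real dispatch happens inside the generated
    _pagexml module; this only mirrors the rule for illustration.
    """
    if not args:
        raise TypeError("crop() expects an xpath string or a sequence of Coords nodes")
    first = args[0]
    if isinstance(first, str):
        # Signature 1: crop(const char *xpath, ...)
        return "xpath"
    if isinstance(first, (list, tuple)):
        # Signature 2: crop(std::vector<xmlNodePt> elems_coords, ...)
        return "elems_coords"
    raise TypeError("unsupported first argument: %r" % type(first).__name__)


# Example values only; any string or sequence triggers the same branches.
print(select_crop_overload("//_:TextLine/_:Coords"))
print(select_crop_overload([object(), object()]))
```

Dispatching on the type of the first argument keeps a single Python entry point while both C++ signatures stay reachable.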

16 files changed: +154 -82 lines changed

README.md

Lines changed: 2 additions & 2 deletions
@@ -10,9 +10,9 @@ Check [py-pagexml/README.rst](py-pagexml/README.rst) and/or [docker/Dockerfile_b

 # Contents

-- [lib](lib): Directory containing the C++ PageXML library.
+- [lib](lib): Directory containing the C++ PageXML and TextFeatExtractor libraries.
 - [py-pagexml](py-pagexml): Swig-based python wrapper for the PageXML library.
-- [py-textfeat](py-textfeat): Swig-based python wrapper for the PageXML library.
+- [py-textfeat](py-textfeat): Swig-based python wrapper for the TextFeatExtractor library.

 # Documentation

docs/py-pagexml/_modules/index.html

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@

 <meta name="viewport" content="width=device-width, initial-scale=1.0">

-<title>Overview: module code &mdash; pagexml 2019.4.23 documentation</title>
+<title>Overview: module code &mdash; pagexml 2019.4.26 documentation</title>



@@ -58,7 +58,7 @@


 <div class="version">
-2019.4.23
+2019.4.26
 </div>


@@ -83,7 +83,7 @@


 <ul>
-<li class="toctree-l1"><a class="reference internal" href="../pagexml.html">pagexml API (version 2019.4.23)</a></li>
+<li class="toctree-l1"><a class="reference internal" href="../pagexml.html">pagexml API (version 2019.4.26)</a></li>
 <li class="toctree-l1"><a class="reference external" href="https://omni-us.github.io/pageformat/pagecontent_omnius.html">OPF XSD schema documentation</a></li>
 </ul>

docs/py-pagexml/_modules/pagexml.html

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@

 <meta name="viewport" content="width=device-width, initial-scale=1.0">

-<title>pagexml &mdash; pagexml 2019.4.23 documentation</title>
+<title>pagexml &mdash; pagexml 2019.4.26 documentation</title>



@@ -58,7 +58,7 @@


 <div class="version">
-2019.4.23
+2019.4.26
 </div>


@@ -83,7 +83,7 @@


 <ul>
-<li class="toctree-l1"><a class="reference internal" href="../pagexml.html">pagexml API (version 2019.4.23)</a></li>
+<li class="toctree-l1"><a class="reference internal" href="../pagexml.html">pagexml API (version 2019.4.26)</a></li>
 <li class="toctree-l1"><a class="reference external" href="https://omni-us.github.io/pageformat/pagecontent_omnius.html">OPF XSD schema documentation</a></li>
 </ul>

docs/py-pagexml/_modules/pagexml/pagexml.html

Lines changed: 32 additions & 8 deletions
@@ -8,7 +8,7 @@

 <meta name="viewport" content="width=device-width, initial-scale=1.0">

-<title>pagexml &mdash; pagexml 2019.4.23 documentation</title>
+<title>pagexml &mdash; pagexml 2019.4.26 documentation</title>



@@ -58,7 +58,7 @@


 <div class="version">
-2019.4.23
+2019.4.26
 </div>


@@ -83,7 +83,7 @@


 <ul>
-<li class="toctree-l1"><a class="reference internal" href="../../pagexml.html">pagexml API (version 2019.4.23)</a></li>
+<li class="toctree-l1"><a class="reference internal" href="../../pagexml.html">pagexml API (version 2019.4.26)</a></li>
 <li class="toctree-l1"><a class="reference external" href="https://omni-us.github.io/pageformat/pagecontent_omnius.html">OPF XSD schema documentation</a></li>
 </ul>

@@ -10494,24 +10494,46 @@ <h1>Source code for pagexml</h1><div class="highlight"><pre>
 <span class="k">return</span> <span class="n">_pagexml</span><span class="o">.</span><span class="n">PageXML_getNodeName</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">node</span><span class="p">,</span> <span class="n">base_node</span><span class="p">)</span></div>


-<div class="viewcode-block" id="PageXML.crop"><a class="viewcode-back" href="../../pagexml.html#pagexml.PageXML.crop">[docs]</a> <span class="k">def</span> <span class="nf">crop</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">xpath</span><span class="p">,</span> <span class="n">margin</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">opaque_coords</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">transp_xpath</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">base_xpath</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
+<div class="viewcode-block" id="PageXML.crop"><a class="viewcode-back" href="../../pagexml.html#pagexml.PageXML.crop">[docs]</a> <span class="k">def</span> <span class="nf">crop</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">):</span>
 <span class="sd">&quot;&quot;&quot;</span>

+<span class="sd"> Overloaded function with 2 signatures.</span>
+
+<span class="sd"> **Signature 1**</span>
+
+<span class="sd"> ``std::vector&lt; NamedImage &gt; PageXML::crop(const char *xpath, cv::Point2f *margin=NULL, bool opaque_coords=true, const char *transp_xpath=NULL, const char *base_xpath=NULL)``</span>
+
+<span class="sd"> Crops images using its Coords polygon, regions outside the polygon are set to transparent.</span>
+
+<span class="sd"> Arguments:</span>
+<span class="sd"> xpath (const char *): Selector for Coord nodes to crop.</span>
+<span class="sd"> margin (cv::Point2f *): Margins, if &gt;1.0 it is considered pixels, otherwise percentage of maximum between crop width and height.</span>
+<span class="sd"> opaque_coords (bool): Whether to include an alpha channel with the polygon interior in opaque.</span>
+<span class="sd"> transp_xpath (const char *): Selector for semi-transparent elements.</span>
+<span class="sd"> base_xpath (const char *): Expression to construct sample name, overriding the default IMGBASE.ELEMID.</span>
+
+<span class="sd"> Returns:</span>
+<span class="sd"> std::vector&lt; NamedImage &gt;: An std::vector containing NamedImage objects of the cropped images.</span>
+
+<span class="sd"> **Signature 2**</span>
+
+<span class="sd"> ``std::vector&lt; NamedImage &gt; PageXML::crop(std::vector&lt; xmlNodePt &gt; elems_coords, cv::Point2f *margin=NULL, bool opaque_coords=true, const char *transp_xpath=NULL, const char *base_xpath=NULL)``</span>
+
 <span class="sd"> Crops images using its Coords polygon, regions outside the polygon are set to transparent.</span>

 <span class="sd"> Arguments:</span>
-<span class="sd"> xpath (const char *): Selector for polygons to crop.</span>
-<span class="sd"> margin (cv::Point2f *): Margins, if &gt;1.0 pixels, otherwise percentage of maximum of crop width and height.</span>
+<span class="sd"> elems_coords (std::vector&lt; xmlNodePt &gt;): Vector of Coord nodes to crop.</span>
+<span class="sd"> margin (cv::Point2f *): Margins, if &gt;1.0 it is considered pixels, otherwise percentage of maximum between crop width and height.</span>
 <span class="sd"> opaque_coords (bool): Whether to include an alpha channel with the polygon interior in opaque.</span>
 <span class="sd"> transp_xpath (const char *): Selector for semi-transparent elements.</span>
-<span class="sd"> base_xpath (const char *): Selector for base node to use to construct the sample name.</span>
+<span class="sd"> base_xpath (const char *): Expression to construct sample name, overriding the default IMGBASE.ELEMID.</span>

 <span class="sd"> Returns:</span>
 <span class="sd"> std::vector&lt; NamedImage &gt;: An std::vector containing NamedImage objects of the cropped images.</span>


 <span class="sd"> &quot;&quot;&quot;</span>
-<span class="k">return</span> <span class="n">_pagexml</span><span class="o">.</span><span class="n">PageXML_crop</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">xpath</span><span class="p">,</span> <span class="n">margin</span><span class="p">,</span> <span class="n">opaque_coords</span><span class="p">,</span> <span class="n">transp_xpath</span><span class="p">,</span> <span class="n">base_xpath</span><span class="p">)</span></div>
+<span class="k">return</span> <span class="n">_pagexml</span><span class="o">.</span><span class="n">PageXML_crop</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">)</span></div>


 <div class="viewcode-block" id="PageXML.stringToPoints"><a class="viewcode-back" href="../../pagexml.html#pagexml.PageXML.stringToPoints">[docs]</a> <span class="k">def</span> <span class="nf">stringToPoints</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">):</span>
@@ -11148,6 +11170,7 @@ <h1>Source code for pagexml</h1><div class="highlight"><pre>
 <span class="sd"> elem (xmlNodePt): Element to move.</span>
 <span class="sd"> node (const xmlNodePt): Reference element for insertion.</span>
 <span class="sd"> itype (PAGEXML_INSERT): Type of insertion.</span>
+<span class="sd"> bugimpl (bool): </span>

 <span class="sd"> Returns:</span>
 <span class="sd"> xmlNodePt: Pointer to moved element.</span>
@@ -11166,6 +11189,7 @@ <h1>Source code for pagexml</h1><div class="highlight"><pre>
 <span class="sd"> elems (const std::vector&lt; xmlNodePt &gt; &amp;): Elements to move.</span>
 <span class="sd"> node (const xmlNodePt): Reference element for insertion.</span>
 <span class="sd"> itype (PAGEXML_INSERT): Type of insertion.</span>
+<span class="sd"> bugimpl (bool): </span>

 <span class="sd"> Returns:</span>
 <span class="sd"> int: Pointer to moved element.</span>
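
The updated `crop` docstring in this file states that a margin above 1.0 is taken as pixels and anything else as a percentage of the larger crop dimension. A minimal sketch of that rule, assuming "percentage" means a fraction between 0 and 1 and that the x and y components of the margin are resolved independently. Both function names are hypothetical and this is not the library's implementation.

```python
def margin_to_pixels(margin, crop_width, crop_height):
    """Resolve one margin value to pixels.

    Values above 1.0 are taken as pixels; other values are treated as a
    fraction of the larger crop dimension (an assumption about what the
    docstring calls "percentage").
    """
    if margin > 1.0:
        return float(margin)
    return margin * max(crop_width, crop_height)


def resolve_margins(margin_xy, crop_width, crop_height):
    """Apply the rule to an (x, y) pair, mirroring the cv::Point2f argument."""
    mx, my = margin_xy
    return (margin_to_pixels(mx, crop_width, crop_height),
            margin_to_pixels(my, crop_width, crop_height))


# An 8 pixel horizontal margin and a 25% vertical margin on a 200x100 crop.
print(resolve_margins((8, 0.25), crop_width=200, crop_height=100))
```

Under these assumptions a value of exactly 1.0 falls on the fractional side of the rule, so it would mean a margin as large as the longer side of the crop.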

docs/py-pagexml/_static/documentation_options.js

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 var DOCUMENTATION_OPTIONS = {
 URL_ROOT: document.getElementById("documentation_options").getAttribute('data-url_root'),
-VERSION: '2019.4.23',
+VERSION: '2019.4.26',
 LANGUAGE: 'None',
 COLLAPSE_INDEX: false,
 FILE_SUFFIX: '.html',

docs/py-pagexml/genindex.html

Lines changed: 3 additions & 3 deletions
@@ -9,7 +9,7 @@

 <meta name="viewport" content="width=device-width, initial-scale=1.0">

-<title>Index &mdash; pagexml 2019.4.23 documentation</title>
+<title>Index &mdash; pagexml 2019.4.26 documentation</title>



@@ -59,7 +59,7 @@


 <div class="version">
-2019.4.23
+2019.4.26
 </div>


@@ -84,7 +84,7 @@


 <ul>
-<li class="toctree-l1"><a class="reference internal" href="pagexml.html">pagexml API (version 2019.4.23)</a></li>
+<li class="toctree-l1"><a class="reference internal" href="pagexml.html">pagexml API (version 2019.4.26)</a></li>
 <li class="toctree-l1"><a class="reference external" href="https://omni-us.github.io/pageformat/pagecontent_omnius.html">OPF XSD schema documentation</a></li>
 </ul>
9090

docs/py-pagexml/index.html

Lines changed: 6 additions & 6 deletions
@@ -8,7 +8,7 @@

 <meta name="viewport" content="width=device-width, initial-scale=1.0">

-<title>py-pagexml: Python wrapper for the PageXML C++ library &mdash; pagexml 2019.4.23 documentation</title>
+<title>py-pagexml: Python wrapper for the PageXML C++ library &mdash; pagexml 2019.4.26 documentation</title>



@@ -35,7 +35,7 @@
 <link rel="stylesheet" href="_static/pygments.css" type="text/css" />
 <link rel="index" title="Index" href="genindex.html" />
 <link rel="search" title="Search" href="search.html" />
-<link rel="next" title="pagexml API (version 2019.4.23)" href="pagexml.html" />
+<link rel="next" title="pagexml API (version 2019.4.26)" href="pagexml.html" />
 </head>

 <body class="wy-body-for-nav">
@@ -59,7 +59,7 @@


 <div class="version">
-2019.4.23
+2019.4.26
 </div>


@@ -84,7 +84,7 @@


 <ul>
-<li class="toctree-l1"><a class="reference internal" href="pagexml.html">pagexml API (version 2019.4.23)</a></li>
+<li class="toctree-l1"><a class="reference internal" href="pagexml.html">pagexml API (version 2019.4.26)</a></li>
 <li class="toctree-l1"><a class="reference external" href="https://omni-us.github.io/pageformat/pagecontent_omnius.html">OPF XSD schema documentation</a></li>
 </ul>

@@ -302,7 +302,7 @@ <h3>Crop an element and save image to disk<a class="headerlink" href="#crop-an-e
 <h2>Documentation Contents<a class="headerlink" href="#documentation-contents" title="Permalink to this headline"></a></h2>
 <div class="toctree-wrapper compound">
 <ul>
-<li class="toctree-l1"><a class="reference internal" href="pagexml.html">pagexml API (version 2019.4.23)</a><ul>
+<li class="toctree-l1"><a class="reference internal" href="pagexml.html">pagexml API (version 2019.4.26)</a><ul>
 <li class="toctree-l2"><a class="reference internal" href="pagexml.html#module-pagexml">pagexml module</a></li>
 </ul>
 </li>
@@ -327,7 +327,7 @@ <h1>Indices and tables<a class="headerlink" href="#indices-and-tables" title="Pe

 <div class="rst-footer-buttons" role="navigation" aria-label="footer navigation">

-<a href="pagexml.html" class="btn btn-neutral float-right" title="pagexml API (version 2019.4.23)" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right"></span></a>
+<a href="pagexml.html" class="btn btn-neutral float-right" title="pagexml API (version 2019.4.26)" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right"></span></a>


 </div>

docs/py-pagexml/objects.inv

0 Bytes
Binary file not shown.
