This document presents a few hints to create application for LibreOffice with Python macros using Py4LO. Some hints are not related to Py4LO.
The Python language is way richer than Basic. But the main difference is the Python standard library. On Windows platforms, this library is restricted but still contains a lot of interesting modules. Those module provide functionalities that are out of reach of Basic.
Note : On Windows platforms, the sqlite3
module is missing, but Py4LO
provides a workaround: the py4lo_sqlite3
module.
Some of the differences:
- Python has a rich standard library whereas LibreOffice Basic has the
Tools
module and theScriptForge
library; - Python is an object oriented programming language whereas Basic is not, even
if you can cheat with
Option Compatible
; - Python has functional programming abilities (functions are variables), Basic has not;
- Python has a powerful exception mechanism, whereas Basic has
On Error GoTo
; - Python has standard data structures (lists, sets, dicts, heaps...) whereas
Basic has arrays and poor
Collections
(or the EnumerableMap API from LibreOffice); - Python provides plain text, XML and JSON input/output functions whereas Basic is limited;
- Python has a powerful logging module, whereas Basic has
MessageBox
; - Python is supported by IDEs (PyCharm is my favourite).
The API is the same for Python or Basic. If you know how to write macros in Basic, you'll see that most of the code will be very similar. But when you create macros using Python, you need to pay more attention to the API, since the methods and fields are case sensitive.
Something like:
oSheet = oDoc.currentcontroller
wont' work, because nor the field currentcontroller
neither the function
getcurrentcontroller()
exists in the API. But getCurrentController()
(and the alias CurrentController
) exists:
oSheet = oDoc.CurrentController
Will work.
If the fnuction has no parameter, don't forget the parenthesis:
s = my_function
will assign the function to s
: s
is now a function, not the result
of a function. Always add parenthesis:
s = my_function()
The typical form in standard Python is:
r = get_resources()
try:
do_wathever_you_want(r)
except Exception as e:
...
finally:
free_resources(r)
When possible, you should use the with
statement (see
PEP 343 and @contextmanager
that ensures that the resources are freed).
with get_resources() as r:
try:
do_wathever_you_want(r)
except Exception as e:
...
Pytohn has a lot of exceptions that are part of the control flow. These are not errors. Let's have a look at an example:
try:
i = s.index("foo")
except ValueError:
... # "foo" was not found
else:
... # "foo" was found at index i
Is the same as:
i = s.find("foo")
if i == -1
... # "foo" was not found
else:
... # "foo" was found at index i
Python uses a lot of control flow exceptions, because of the
EAFP philosophy. Common
examples of "EAFP"s: casts (int
, float
...), search in strings or
sequences, use of attributes that may not exist, decode from bytes to string...
Python provides a simple logging
module. You first need to configure the
logger, then use it like this:
class Foo:
_logger = logging.getLogger(__name__)
...
def f(self, x: int):
self._logger.debug("f(%s) was called", i)
...
Py4LO has a project structure inspired by Maven (the Apache tool "that can manage a project's build, reporting and documentation").
The directory structure is:
src/ main/ test/
The main directory must contain an entry point, typically main.py, and may contain subdirectories. The three tier architecture is a good way to isolate the application code from LibreOffice. We'll see that this is a requirement to produce a testable and maintainable code. Let's create this structure:
src/ main/ application/ data/ domain/ main.py presentation/ dialog/ test/ test_application/ test_data/ test_domain/ test_presentation/ test_dialog/
The dialog directory is a part of the presentation tier that contains with dialog factories.
The domain contains objects from the domain.
I usually have a FrontEnd
class, in the presentation tier, and a
Controller
class in the application tier.
The front end is responsible for creating dialog and listener, and call
the controller that does the job: manipulating the domain objects and loading
and storing them using classes of the data tier:
FrontEnd ---------------- presentation tier uses | AbstractController | ^ | | implements | | | depends on ConcreteController ------------------------ domain application tier uses | AbstractDataStore | ^ | | implements | | | data tier ConcreteDataStore -----------
Py4LO uses an entry point to insert some boilerplate code (see
https://github.com/jferard/py4lo/blob/master/inc/py4lo_import.py for the
boilerplate code) to add the scripts of the document to the sys.path
.
# py4lo: entry
Once the entry point is declared, you can import any embedded module.
The entry point is the main.py
file, and calls the front end to
open the main dialog.
Py4LO provides two directives to embed modules in the LibreOffice document:
# py4lo: embed lib py4lo_typing
# py4lo: embed script a_script
The directive embed lib
embeds modules from the Py4LO standard library.
The directive embed script
embeds scripts or modules located in the
src/opt
directory.
In Basic, you create a command button using the Forms Controls. See Adding a Command Button to a Document. But the LibreOffice interface does not provide a simple way to assign a Python macro to the button.
Py4LO provides a way to create button from functons. If you run the
py4lo init
command, Py4Lo will read the sources files to find functions,
and create a button for each function. Take the new-project.ods
document
and start with this document.
Otherwise, you'll have to edit the XML file content.xml
.
First, create the button and assign whatever Basic macro you want to the "execute" action. LibreOffice will set an URL for the action (see com.sun.star.uri.XVndSunStarScriptUrl for details on the URL format).
Open the ODS file with an utility to decompress zip files (Ark, File Roller,
7-Zip...). You'll the content.xml
file at the root level. Edit this file
and search the string "vnd.sun.star.script".
You will find a sequence like this one:
<script:event-listener script:language="ooo:script" script:event-name="form:performaction" xlink:href="vnd.sun.star.script:[...]?language=Basic&location=document" xlink:type="simple"/>
You have to change the xlink:href
attribute to something like:
"vnd.sun.star.script:main.py$function?language=Python&location=document"
and put the modified content.xml
file into the archive.
The main information source about the API is the LibreOffice SDK API Reference. Bookmark this link and use it!
There are two extensions that you might have used in Basic: XRay and MRI.
Py4LO provides the functions xray
and mri
(see py4lo_helper
module).
Those functions will fail if the extensions are not installed.
Here's a code snippet to use XRay:
oScriptProvider = oDoc.getScriptProvider()
oScript = oScriptProvider.getScript(
"vnd.sun.star.script:XrayTool._Main.Xray?"
"language=Basic&location=application")
oScript.invoke((obj,), tuple(), tuple())
Error handling and resources management is quite cumbersome in Basic
(On Error GoTo ...
is not easy to use).
Now that, coming from Basic, you discovered the try
/ except
statement,
you imagine that you'll be able to control everything by fixing errors as soon
as they are raised. That may be optimistic.
We've alreay seen exceptions that are part of the control flow (see EAFP). Those exceptions are handled as soon as they are raised. But what about other exceptions (actual errors: network not available, not enough disk space, data of unexpected type...)?
Usually, the most sensible thing to do is to log errors to be able to understand what went wrong and quit gracefully. That may seem frustrating.
You are not convinced? You want your function that writes a 1kb file to catch OSError, MemoryError and other kind of errors. Who knows? That might happen and you are never too careful.
Okay. Think of responsibility. Is a tiny function responsible for the disk space, the memory and a bunch of other "big" things that might go wrong? You'd think that is the OS that is responsible for that, not the function! And you can't, most of the time, fix the unexpected situation: it's too late. Too late for asking the user to correct the data, free some disk space or something like that.
Rule of thumb: If you think that a function is responsible for handling an error, ask yourself if the calling function is not more responsible than the called function.
If you follow this rule of thumb, then you'll discover that the top function is often responsible for handling the errors. Thus, always wrap your entry points (macro assigned to a function or event listeners) with this code:
try:
...
except Exception as e:
self._logger.exception("Something bad happened!")
message_box("Error", "Contact me and send the log file")
To import constants, use the following syntax:
class MessageBoxButtons:
from com.sun.star.awt import (BUTTONS_OK, BUTTONS_OK_CANCEL, ...)
Note: the wildcard import won't work.
Python offers a powerful unittest_module. You can mock objects, including LibreOffice API objects, to test your code.
But Python won't be able to find LibreOffice objects (services, interfaces, structures, constants) during the test:
from com.sun.star.awt import XActionListener
Will raise an ImportError
outside of the LibreOffice runtime. Does that
mean that the code is not testable? No.
First of all, you should isolate the "application" code (see 3-tier architecture) from the LibreOffice runtime. That means that application code does not rely on "com.sun.star" imports, and thus should be testable.
But what about the "gui" and the "data" code? You can make them testable with a simple trick:
try:
from com.sun.star.awt import XActionListener
except ImportError:
class XActionListener:
pass
The try / except
statement means: if we are in LibreOffice runtime, then
import the objects, else create a void object to make the code runnable.
Py4LO provides two hidden modules: _mock_constants
and _mock_objects
,
to help you:
try:
from com.sun.star.awt import XActionListener
except ImportError:
from _mock_objects import XActionListener
Those module do not contain the whole LibreOffice API, but only the most
useful constants and objects to me. When your project grows, you may create
a dialog_util.py
module that does this:
try:
import uno
import unohelper
from com.sun.star.awt import XActionListener
class MessageBoxButtons:
from com.sun.star.awt import (BUTTONS_OK, BUTTONS_OK_CANCEL, ...)
except ImportError:
from _mock_objects import uno, unohelper, XActionListener
from _mock_constants import MessageBoxButtons
And never import anything from "com.sun.star" out of this module. In the other modules, just use:
from dialog_util import uno, unohelper, XActionListener, MessageBoxButtons...
Py4Lo does not provide any debugging tool.
In Basic, the switch between the IDE and the running macro is very easy: as soon as an error is raised, you are in the IDE at the line of code, and you can fix it.
In Python, unless you use APSO, you will have a murky message and no way to edit the code in place. Therefore, you'll have to limited the back and forth between the code and the execution.
The Py4Lo solution is to avoid debugging: test and log.
Py4LO can configure the logger:
api_py4lo_commons = Commons.create(XSCRIPTCONTEXT)
log_file = api_py4lo_commons.join_dir("foo.log") # foo.log in the same directory as the document
api_py4lo_commons.init_logger(
log_file, mode='w',
level=logging.DEBUG,
fmt='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
)
You can build a dialog from scratch (py4lo_dialogs
provides some functions
that will help you).
You can also use provider.get_dialog("Standard.mydialog")
to get a dialog
built with the LibreOffice dialog editor.
In basic, you assign macros to the dialog elements. In Python, it's a common practice to programmatically add a listener:
def create_dialog(dialog_name: str) -> UnoControl:
oDialog = provider.get_dialog("Standard.mydialog")
oOkListener = OkListener(...)
oOkButton = oDialog.getControl("ok_button")
oOkButton.addActionListener(oOkListener)
See next section for more details about OkListener
.
In Basic, you assign a function to each button or widget of the dialog. In Python, you won't use this method, but add programmatically the listener to a widget.
class OkListener(unohelper.Base, XActionListener):
_logger = logging.getLogger(__name__)
def __init__(self, front_end: FrontEnd, oDialog: Any):
self._front_end = front_end
self._oDialog = oDialog
def actionPerformed(self, e):
self._logger.debug("Ok")
try:
self._front_end.do_something()
self._oDialog.endExecute()
except:
self._logger.exception("Ok")
LibreOffice calls a Python script when:
- the user calls a macro from Tools > Macros > Run macro menu ;
- the user hits a button having a linked macro ;
- the user selects an item linked to a macro from a custom menu;
- the user triggers an event in a dialog.
When the Python script starts, LibreOffice is interrupted, waiting for the end of the script.
We may have the following sequence:
user hits user hits the a LO button the dialog is shown dialog button LibreOffice: ------| |-------------------------------| | <-- linked script is called | | <--- listener calls Python script. Python : |-----------------------------| |------------------------ script creates a dialog, python script adds a button with a listener
But what if I want to interact with LibreOffice component (document, sheets, cells, infobar...) while the script is running? This won't have any visible effect until the script ends and LibreOffice takes back the control.
Try this function:
def func(*_args):
oDoc = provider.doc
oSheet = oDoc.CurrentController.ActiveSheet
oSheet.getCellByPosition(0, 0).String = "foo"
time.sleep(1.0)
oSheet.getCellByPosition(0, 0).String = "bar"
time.sleep(1.0)
oSheet.getCellByPosition(0, 0).String = "baz"
time.sleep(1.0)
You may think that the "foo", "bar", baz" string will appear successively in cell A1 of the current active sheet instantly, but it won't. It will take 3 seconds to leave the Python script and during this 3 seconds, nothing will change in the sheet ( LibreOffice "hangs"). Then the last string, "baz", will appear in the cell A1.
Side note: if you understand this, you won't use oDoc.lockControllers() and oDoc.unlockControllers() anymore in Python: the interface is already locked.
There is a reason why you might want to update the LibreOffice components during a script : when running a long script, you need to inform the user of what's happening. Furthermore, we don't like when LibreOffice hangs for a long time.
There is a solution: Python threads. If we start a thread in the Python script, but do not wait until the thread finished (ie do not use Thread.join()), then LibreOffice will take the control back, but the Python thread will continue tu be executed:
user hits main script end of end of a LO button creates a thread main script thread script LibreOffice: ------| | |------------------ ... -----------------------|------------------- | <-- linked script is called | | ^ | Python : |-----------------------------|--------------| updates LibreOffice component | main script | | | Python thread : |--------------------------------- ... -----------------------| thread script
This function will give the expected result (write "foo", wait 1 second, write "bar", wait 1 second, write "baz", wait 1 second):
def func(*_args):
oDoc = provider.doc
oSheet = oDoc.CurrentController.ActiveSheet
def aux():
oSheet.getCellByPosition(0, 0).String = "foo"
time.sleep(1.0)
oSheet.getCellByPosition(0, 0).String = "bar"
time.sleep(1.0)
oSheet.getCellByPosition(0, 0).String = "baz"
time.sleep(1.0)
t = threading.Thread(target=aux)
t.start()
Use this method if you want to update a dialog, an infobar, etc.
Beware: if there is an error in a Python thread, LibreOffice won't show any error message.
In Basic, you have to use a trick to create event listeners: the function
CreateUnoListener
takes a prefix
parameter that is used to declare
the methods of the listener.
In Python, you just need to subclass the right XEventListener
sub interface.
See an example with the XActionListener
interface:
class MyActionListener(unohelper.Base, XActionListener):
def actionPerformed(self, e):
... do someting
And somewhere else:
oButton = oDialog.getControl("button")
obutton.addActionListener(MyActionListener())
The unohelper.Base
class provides a default implementation for
com.sun.star.lang.XTypeProvider
, and the subclass MyActionListener
has to provide the actionPerformed
method implementation.
If you want the main thread to be informed of something, just use an observer in MyActionListener. Example:
class MyActionListener(unohelper.Base, XActionListener):
def __init__(self, observer):
self._observer = observer
def actionPerformed(self, e):
... do someting
observer.notify("hey, I'm the event listener")
If an unexpected exception is raised inside the event listener method (actionPerformed here), then the event thread ends abruptly and no one will know about the exception: neither the event thread nor the main thread. Since, ignoring an exception is not usually a good pratice ( although that may be the right choice in some cases), you will end adding boilerplate code to each event listener method:
class ActionListener(unohelper.Base, XActionListener):
def actionPerformed(self, e):
try:
... do someting
except:
logging.getLogger(__name__).exception("Exception")
You might also want to trace when the method is called. Py4LO provides a useful decorator to make both:
class ActionListener(unohelper.Base, XActionListener):
@trace_event(__name__)
def actionPerformed(self, e):
... do someting
If you don't want to trace the enter/exit events, then use:
class TextListener(unohelper.Base, XTextListener):
@trace_event(__name__, enter_exit=False)
def textChanged(self, e):
... do someting
The application tier contains the logic of the program. It should be independant from the presentation tier and depend on interfaces, not on the data tier.:
presentation tier application tier data tier
In LO Basic, you have to load the file (loadComponentFromURL
) and process
the DataArray. In Python, you can use the
csv
module to parse the file.
Assuming that each row represents a MyObject
instance (MyObject
is
part of the domain), you can create
a MyObjectHelper
with some useful methods. The first method is
MyObjectHelper.from_row(row: List[str]) -> MyObject
. This method creates a
new object, with typed fields, from a row.
It is very important build a
consistent Python object a soon as possible: parse the dates, detect enum values,
parse booleans, decide whether a void value is None
or an empty string.
Once the data is typed and the object built, you can work with it (and test it)
out of LibreOffice.
The reverse method, MyObjectHelper.to_row(obj: MyObject) -> List[str]
may be useful to store an object in a CSV file.
Sometimes, you have to build objects on top of several rows. It's common when you have a denormalized CSV file. In this case you have to build first records, that are simple typed representations of the data rows. And then build object from these records.
The design is roughly the same:
MyRecordHelper.from_row(row: List[str]) -> MyRecord
will build the
record. A record is a simple dataclass <https://docs.python.org/3/library/dataclasses.html>_
but, fields are typed. On top of these records, you have
MyObjectHelper.from_records(recs: List[MyRecord]) -> MyObject
that builds
an object from records:
recs = [
MyRecordHelper.from_row(row)
for row in csv_reader
]
rec_by_name = {}
for rec in recs:
rec_by_name.setdefault(rec.name, []).append(rec)
objs = [
MyObjectHelper.from_records(recs)
for name, recs in rec_by_name.items()
]
A DataArray is an array of arrays of values mapped to a SheeCellRange (see https://api.libreoffice.org/docs/idl/ref/interfacecom_1_1sun_1_1star_1_1sheet_1_1XCellRangeData.html and https://wiki.documentfoundation.org/Documentation/DevGuide/Spreadsheet_Documents#Data_Array). In Python, values of a DataArray are:
str
instances for text cellsfloat
instances for doubles, dates, hours, integers, boolean, percents, currencies, fractions.None
for#N/A
values.
The general idea is the same as when you Process a CSV file: build a consistent Python object as soon as possible. You might have to build records before you build objects.
Py4LO provides py4lo_sqlite3 a module to work with SQLite databases, because the standard Python sqlite3 module is missing on Windows systems. The py4lo_sqlite3 module is low-level and does not comply with the PEP249, but it provides some useful functions.
Py4LO provides also py4lo_base, a module to work with LibreOffice Base documents.
The general idea is the same as when you Process a CSV file: build a consistent Python object as soon as possible. You might have to build records before you build objects.
Once objects or records are built, you'll need some methods to write them into a CSV file, a DataArray or a database:
MyObjectHelper.to_row(obj: MyObject) -> List[str]
MyObjectHelper.to_data_row(obj: MyObject) -> DATA_ROW
Once you have Python objects, you can store them in one or several tables.
A simple Python object may be stored in a table. Create a
MyObjectHelper.bind(stmt: Sqlite3Statement, obj: MyObject)
to
bind the fields of the object to the columns of the table.
with sqlite_open(self._path, "rw") as db:
with db.transaction():
with db.prepare("INSERT INTO table VALUES(?, ?, ?, ?)") as stmt:
for obj in objs:
stmt.reset()
MyObjectHelper.bind(stmt, obj)
stmt.execute_update()
When object are more than simple records, a minimal normalization (see for
instance https://en.wikipedia.org/wiki/Boyce%E2%80%93Codd_normal_form)
is recommended. Use an abstract SQLBond
class, with a
SQLBond.bind(stmt: Sqlite3Statement)
method. Each SQLBond
is able
to bind variables to a statement. The method
MyObjectHelper.table1_bonds(obj) -> List[SQLBond]
returns a list
of bonds for the object:
with sqlite_open(self._path, "rw") as db:
with db.transaction():
with db.prepare("INSERT INTO table1 VALUES(?, ?, ?, ?)") as stmt:
for obj in objs:
for bond in MyObjectHelper.table1_bonds(obj):
stmt.reset()
bond.bind(stmtobj)
stmt.execute_update()
Create as many MyObjectHelper.table<n>_bonds(obj) -> List[SQLBond]
as
necessary.
Now that the objects are correctly stored, you can use the full power of SQL queries. To handle the result of those queries,
A classical need is to load data from a CSV file or a DataArray to a SQLite database, or from a SQLite database to a DataArray or a CSV file.
If you come from LibreOffice Basic, you might think that keeping the
storage formats as close as possible is the best solution. It is not.
If you load a CSV file into a database, don't store values as TEXT
in
the database. If you load a DataArray, don't store values as
TEXT
or DOUBLE
in the database.
You may use a SQLite database to store a DataArray. Values of a DataArray are strings, floats or Nones. Since this is raw data, SQL capacities are not really available. You can:
SELECT * FROM <table>
and put it back into a DataArray ;SELECT SUM(<column>) FROM <table>
to check the sum of the values of a column ;- do other basic checks.
But you can't:
- use the SQLite date and time functions since dates in a DataArray are
a number of days since
oDoc.NullDate
. - rely on such a raw data to do complex queries: you'll need a more accurate
typing:
NULL
is#N/A
, but what about an empty string? Should it be treated as aNULL
value?- What about bools or integers? They are mixed with floats.
- How to use
typeof(...)
?
Unless you use the database as a temporary storage or to do some basic check on millions of data rows, you have to do a little more.
The solution is to build domain objects then store them:
objs = [
MyObjectHelper.from_row(row)
for row in csv_reader
]
with sqlite_open(self._path, "rw") as db:
with db.transaction():
with db.prepare("INSERT INTO table VALUES(?, ?, ?, ?)") as stmt:
for obj in objs:
stmt.reset()
MyObjectHelper.bind(stmt, obj)
stmt.execute_update()
This ensures that the database is a correct representation of the objects, not of the raw data. This may seem overkill, but it has a lot of advantages:
- It comes for free because it uses functions that you have already written ;
- It allows you to check the values (e.g. sum of a column) ;
- SQL queries are easy to use.
Summary:
DOMAIN +---> Objects ---+ / ^ \ ---------------------/----------|---------\---------------- / | \ /-------> Records \ DATA / v CSV File - // RAW DATA // -> SQLite database
If you try to use the bottom path (RAW DATA
), you may experience some hard
times.
Todo.