Commit 34c7344

Refactor argument passing
- Refactor argument passing so that instead of implicitly-spawned `ImplementedDataInfo` objects, there are actual arguments (for automatic offsets and strides, base storage, and `sep`-tagged arrays). It also centralizes the logic for what goes into argument lists, instead of having various "filtered" versions scattered about.
- Get started on type-annotating a bit of loopy.
- Switch a not-small number of data structures to be dataclasses, notably `LoopKernel`.
- Drop OCCA support from the ISPC target. (I'm not aware of any users, ever.)
- Drop the Numba target outright. (I'm not aware of any users, ever.)
- Drop `LoopKernel.local_sizes`, which was usable to directly set the workgroup size. (I'm not aware of any users, ever.)
- Expire the deprecation for `iname_to_tags`.
- Bumps the Python compatibility target to 3.8, for `from __future__ import annotations` and `cached_property` (mypy does not support nested decorators)
- Bug fix: `tags` was not part of `LoopKernel.hash_fields`
- Bug fix: `InstructionBase.get_write_dependency_names()` was used to find written variables, `InstructionBase.assignee_var_names()` is correct
- Bug fix: KernelExecutorBase now uses linearize() so as to not bypass pre-linearization checks (cf. gh-639)
1 parent ffa29ab commit 34c7344
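
The commit message cites `cached_property` as one reason for the Python 3.8 bump and mentions the move to dataclasses. The following is a minimal sketch of how the two can combine. `ToyKernel` and its fields are hypothetical stand-ins for illustration, not loopy's `LoopKernel`.

```python
from dataclasses import dataclass
from functools import cached_property  # in the standard library since Python 3.8
from typing import Tuple


@dataclass(frozen=True)
class ToyKernel:
    """Hypothetical kernel-like record (an assumption, not loopy's class)."""
    name: str
    arg_names: Tuple[str, ...]

    @cached_property
    def sorted_arg_names(self) -> Tuple[str, ...]:
        # Computed on first access. cached_property stores the result in
        # the instance __dict__ directly, so it also works on a frozen
        # dataclass (as long as the class does not use __slots__).
        return tuple(sorted(self.arg_names))


knl = ToyKernel("rotate_v2", ("n", "arr"))
first = knl.sorted_arg_names
second = knl.sorted_arg_names  # served from the cache, not recomputed
```

A single `@cached_property` decorator takes the place of stacking a memoizing decorator under `@property`, which is the nested-decorator pattern the commit message says mypy does not support.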

60 files changed, +2971 -2653 lines changed

.github/workflows/ci.yml

+14 -1
@@ -17,7 +17,7 @@ jobs:
       uses: actions/setup-python@v1
       with:
         # matches compat target in setup.py
-        python-version: '3.6'
+        python-version: '3.8'
     - name: "Main Script"
       run: |
         curl -L -O https://gitlab.tiker.net/inducer/ci-support/raw/main/prepare-and-run-flake8.sh
@@ -35,6 +35,19 @@ jobs:
         curl -L -O https://gitlab.tiker.net/inducer/ci-support/raw/main/prepare-and-run-pylint.sh
         . ./prepare-and-run-pylint.sh "$(basename $GITHUB_REPOSITORY)" test/test_*.py

+  mypy:
+    name: Mypy
+    runs-on: ubuntu-latest
+    steps:
+    - uses: actions/checkout@v2
+    - name: "Main Script"
+      run: |
+        curl -L -O https://tiker.net/ci-support-v0
+        . ./ci-support-v0
+        build_py_project_in_conda_env
+        python -m pip install mypy
+        ./run-mypy.sh
+
   pytest:
     name: Conda Pytest
     runs-on: ubuntu-latest

.gitlab-ci.yml

+12
@@ -168,6 +168,18 @@ Flake8:
   except:
   - tags

+Mypy:
+  script: |
+    curl -L -O https://tiker.net/ci-support-v0
+    . ./ci-support-v0
+    build_py_project_in_venv
+    python -m pip install mypy
+    ./run-mypy.sh
+  tags:
+  - python3
+  except:
+  - tags
+
 Benchmarks:
   stage: test
   script:

doc/conf.py

+19
@@ -35,6 +35,25 @@
     "https://pyrsistent.readthedocs.io/en/latest/": None,
 }

+# Some modules need to import things just so that sphinx can resolve symbols in
+# type annotations. Often, we do not want these imports (e.g. of PyOpenCL) when
+# in normal use (because they would introduce unintended side effects or hard
+# dependencies). This flag exists so that these imports only occur during doc
+# build. Since sphinx appears to resolve type hints lexically (as it should),
+# this needs to be cross-module (since, e.g. an inherited arraycontext
+# docstring can be read by sphinx when building meshmode, a dependent package),
+# this needs a setting of the same name across all packages involved, that's
+# why this name is as global-sounding as it is.
+import sys
+sys._BUILDING_SPHINX_DOCS = True
+
 nitpick_ignore_regex = [
     ["py:class", r"typing_extensions\.(.+)"],
+    ["py:class", r"numpy\.u?int[0-9]+"],
+    ["py:class", r"numpy\.float[0-9]+"],
+    ["py:class", r"numpy\.complex[0-9]+"],
+
+    # As of 2022-06-22, it doesn't look like there's sphinx documentation
+    # available.
+    ["py:class", r"immutables\.(.+)"],
 ]
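
The comment in `doc/conf.py` describes a flag that lets modules import annotation-only dependencies during a doc build. Below is a minimal sketch of how a module might consume that flag. It assumes the common guard of `TYPE_CHECKING or getattr(sys, "_BUILDING_SPHINX_DOCS", False)`; the standard-library `fractions` module is only a stand-in for a heavy optional dependency such as PyOpenCL, and `halve` is a hypothetical function.

```python
import sys
from typing import TYPE_CHECKING

# Import the annotation-only dependency when a type checker is running or
# when the docs are being built, but not in normal use. `fractions` is a
# stand-in here (assumption) for something expensive or optional.
if TYPE_CHECKING or getattr(sys, "_BUILDING_SPHINX_DOCS", False):
    import fractions


def halve(x: "fractions.Fraction") -> "fractions.Fraction":
    # The annotations are plain strings, so calling this function never
    # requires the name `fractions` to be bound at runtime.
    return x / 2
```

Using `getattr` with a default keeps the module importable when the attribute was never set, which is the normal, non-doc-build case.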

doc/ref_kernel.rst

-10
@@ -515,24 +515,14 @@ Arguments
 ^^^^^^^^^

 .. autoclass:: KernelArgument
-    :members:
-    :undoc-members:

 .. autoclass:: ValueArg
-    :members:
-    :undoc-members:

 .. autoclass:: ArrayArg
-    :members:
-    :undoc-members:

 .. autoclass:: ConstantArg
-    :members:
-    :undoc-members:

 .. autoclass:: ImageArg
-    :members:
-    :undoc-members:

 .. _temporaries:

doc/ref_transform.rst

+2
@@ -52,6 +52,8 @@ Influencing data access

 .. automodule:: loopy.transform.privatize

+.. autofunction:: allocate_temporaries_for_base_storage
+
 Padding Data
 ------------

doc/tutorial.rst

+9 -9
@@ -235,7 +235,7 @@ inspect that code, too, using :attr:`loopy.Options.write_wrapper`:
         if allocator is None:
             allocator = _lpy_cl_tools.DeferredAllocator(queue.context)
     <BLANKLINE>
-        # {{{ find integer arguments from shapes
+        # {{{ find integer arguments from array data
     <BLANKLINE>
         if n is None:
             if a is not None:
@@ -1228,11 +1228,11 @@ should call :func:`loopy.get_one_linearized_kernel`:
     ...
     ---------------------------------------------------------------------------
     LINEARIZATION:
-      0: CALL KERNEL rotate_v2(extra_args=[], extra_inames=[])
+      0: CALL KERNEL rotate_v2
       1: tmp = arr[i_inner + i_outer*16] {id=maketmp}
       2: RETURN FROM KERNEL rotate_v2
       3: ... gbarrier
-      4: CALL KERNEL rotate_v2_0(extra_args=[], extra_inames=[])
+      4: CALL KERNEL rotate_v2_0
       5: arr[(1 + i_inner + i_outer*16) % n] = tmp {id=rotate}
       6: RETURN FROM KERNEL rotate_v2_0
     ---------------------------------------------------------------------------
@@ -1260,18 +1260,18 @@ put those instructions into the schedule.
     ...
     ---------------------------------------------------------------------------
     TEMPORARIES:
-    tmp: type: np:dtype('int32'), shape: () aspace:private
-    tmp_save_slot: type: np:dtype('int32'), shape: (n // 16, 16), dim_tags: (N1:stride:16, N0:stride:1) aspace:global
+    tmp: type: np:dtype('int32'), shape: () aspace: private
+    tmp_save_slot: type: np:dtype('int32'), shape: (n // 16, 16), dim_tags: (N1:stride:16, N0:stride:1) aspace: global
     ---------------------------------------------------------------------------
     ...
     ---------------------------------------------------------------------------
     LINEARIZATION:
-      0: CALL KERNEL rotate_v2(extra_args=['tmp_save_slot'], extra_inames=[])
+      0: CALL KERNEL rotate_v2
       1: tmp = arr[i_inner + i_outer*16] {id=maketmp}
       2: tmp_save_slot[tmp_save_hw_dim_0_rotate_v2, tmp_save_hw_dim_1_rotate_v2] = tmp {id=tmp.save}
       3: RETURN FROM KERNEL rotate_v2
       4: ... gbarrier
-      5: CALL KERNEL rotate_v2_0(extra_args=['tmp_save_slot'], extra_inames=[])
+      5: CALL KERNEL rotate_v2_0
       6: tmp = tmp_save_slot[tmp_reload_hw_dim_0_rotate_v2_0, tmp_reload_hw_dim_1_rotate_v2_0] {id=tmp.reload}
       7: arr[(1 + i_inner + i_outer*16) % n] = tmp {id=rotate}
       8: RETURN FROM KERNEL rotate_v2_0
@@ -1297,15 +1297,15 @@ The kernel translates into two OpenCL kernels.
     #define lid(N) ((int) get_local_id(N))
     #define gid(N) ((int) get_group_id(N))
     <BLANKLINE>
-    __kernel void __attribute__ ((reqd_work_group_size(16, 1, 1))) rotate_v2(__global int *__restrict__ arr, int const n, __global int *__restrict__ tmp_save_slot)
+    __kernel void __attribute__ ((reqd_work_group_size(16, 1, 1))) rotate_v2(__global int const *__restrict__ arr, int const n, __global int *__restrict__ tmp_save_slot)
     {
       int tmp;
     <BLANKLINE>
       tmp = arr[16 * gid(0) + lid(0)];
       tmp_save_slot[16 * gid(0) + lid(0)] = tmp;
     }
     <BLANKLINE>
-    __kernel void __attribute__ ((reqd_work_group_size(16, 1, 1))) rotate_v2_0(__global int *__restrict__ arr, int const n, __global int const *__restrict__ tmp_save_slot)
+    __kernel void __attribute__ ((reqd_work_group_size(16, 1, 1))) rotate_v2_0(__global int *__restrict__ arr, int const n, __global int const *__restrict__ tmp_save_slot)
     {
       int tmp;
     <BLANKLINE>
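
The linearization in the tutorial hunks above saves `tmp` to a global slot in one kernel and reloads it in the next. As a rough illustration of that data flow, here is a plain-Python emulation, not loopy-generated code. It assumes `n` is a multiple of 16 and runs the work items sequentially; the function names simply mirror the kernel names in the listing.

```python
def rotate_v2(arr, n, tmp_save_slot):
    # First kernel: only reads arr (hence `int const *` in the new
    # signature) and writes each work item's tmp to the global save slot.
    for gid in range(n // 16):
        for lid in range(16):
            tmp = arr[16 * gid + lid]
            tmp_save_slot[16 * gid + lid] = tmp


def rotate_v2_0(arr, n, tmp_save_slot):
    # Second kernel: only reads the save slot and writes the rotated
    # position in arr.
    for gid in range(n // 16):
        for lid in range(16):
            tmp = tmp_save_slot[16 * gid + lid]
            arr[(1 + lid + gid * 16) % n] = tmp


n = 32
arr = list(range(n))
slot = [0] * n
rotate_v2(arr, n, slot)     # global barrier would sit between the two calls
rotate_v2_0(arr, n, slot)
```

After both calls, every element has moved one position to the right, with the last element wrapping around to index 0. The read-only role each buffer plays in one of the two kernels matches the `const` qualifiers that the hunk adds to the kernel signatures.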

loopy/__init__.py

+11 -9
@@ -95,7 +95,8 @@
         alias_temporaries, set_argument_order,
         rename_argument,
         set_temporary_scope,
-        set_temporary_address_space)
+        set_temporary_address_space,
+        allocate_temporaries_for_base_storage)

 from loopy.transform.subst import (extract_subst,
         assignment_to_subst, expand_subst, find_rules_matching,
@@ -157,7 +158,6 @@
 from loopy.target.opencl import OpenCLTarget
 from loopy.target.pyopencl import PyOpenCLTarget
 from loopy.target.ispc import ISPCTarget
-from loopy.target.numba import NumbaTarget, NumbaCudaTarget

 from loopy.tools import Optional, t_unit_to_python, memoize_on_disk

@@ -216,6 +216,7 @@
         "remove_unused_arguments",
         "alias_temporaries", "set_argument_order",
         "rename_argument", "set_temporary_scope", "set_temporary_address_space",
+        "allocate_temporaries_for_base_storage",

         "find_instructions", "map_instructions",
         "set_instruction_priority", "add_dependency",
@@ -302,7 +303,6 @@
         "CWithGNULibcTarget", "ExecutableCWithGNULibcTarget",
         "CudaTarget", "OpenCLTarget",
         "PyOpenCLTarget", "ISPCTarget",
-        "NumbaTarget", "NumbaCudaTarget",
         "ASTBuilderBase",

         "Optional", "memoize_on_disk",
@@ -366,7 +366,7 @@ def set_options(kernel, *args, **kwargs):
 # {{{ library registration

 @for_each_kernel
-def register_preamble_generators(kernel, preamble_generators):
+def register_preamble_generators(kernel: LoopKernel, preamble_generators):
     """
     :arg manglers: list of functions of signature ``(preamble_info)``
         generating tuples ``(sortable_str_identifier, code)``,
@@ -376,7 +376,8 @@ def register_preamble_generators(kernel, preamble_generators):
     """
     from loopy.tools import unpickles_equally

-    new_pgens = kernel.preamble_generators[:]
+    new_pgens = tuple(kernel.preamble_generators)
+
     for pgen in preamble_generators:
         if pgen not in new_pgens:
             if not unpickles_equally(pgen):
@@ -385,7 +386,7 @@ def register_preamble_generators(kernel, preamble_generators):
                         "and would thus disrupt loopy's caches"
                         % pgen)

-            new_pgens.insert(0, pgen)
+            new_pgens = (pgen,) + new_pgens

     return kernel.copy(preamble_generators=new_pgens)

@@ -394,7 +395,7 @@ def register_preamble_generators(kernel, preamble_generators):
 def register_symbol_manglers(kernel, manglers):
     from loopy.tools import unpickles_equally

-    new_manglers = kernel.symbol_manglers[:]
+    new_manglers = kernel.symbol_manglers
     for m in manglers:
         if m not in new_manglers:
             if not unpickles_equally(m):
@@ -403,7 +404,7 @@ def register_symbol_manglers(kernel, manglers):
                         "and would disrupt loopy's caches"
                         % m)

-            new_manglers.insert(0, m)
+            new_manglers = (m,) + new_manglers

     return kernel.copy(symbol_manglers=new_manglers)

@@ -484,7 +485,8 @@ def make_copy_kernel(new_dim_tags, old_dim_tags=None):
     result = make_kernel(set_str,
             "output[%s] = input[%s]"
             % (commad_indices, commad_indices),
-            lang_version=MOST_RECENT_LANGUAGE_VERSION)
+            lang_version=MOST_RECENT_LANGUAGE_VERSION,
+            default_offset=auto)

     result = tag_array_axes(result, "input", old_dim_tags)
     result = tag_array_axes(result, "output", new_dim_tags)
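
The `register_*` hunks above replace in-place `list.insert(0, ...)` with tuple concatenation, which fits a kernel object whose fields are immutable. The sketch below reproduces that prepend-to-tuple pattern on a toy frozen dataclass. `ToyKernel`, its `copy` method, and `register_generators` are hypothetical names for illustration, and the pickling check from the real code is left out.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class ToyKernel:
    """Hypothetical immutable kernel record (not loopy's LoopKernel)."""
    preamble_generators: Tuple[Callable, ...] = ()

    def copy(self, **kwargs) -> "ToyKernel":
        # Return a modified copy; the original instance is left untouched.
        return replace(self, **kwargs)


def register_generators(kernel: ToyKernel,
                        generators: Iterable[Callable]) -> ToyKernel:
    new_pgens = tuple(kernel.preamble_generators)
    for pgen in generators:
        if pgen not in new_pgens:
            # Tuples have no insert(); build a new tuple with pgen in front.
            new_pgens = (pgen,) + new_pgens
    return kernel.copy(preamble_generators=new_pgens)


def gen_a(preamble_info):
    return []


def gen_b(preamble_info):
    return []


knl = ToyKernel(preamble_generators=(gen_a,))
knl2 = register_generators(knl, [gen_b, gen_a])
```

Here `gen_b` is prepended and the duplicate `gen_a` is skipped, while `knl` keeps its original one-element tuple.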
