From fbf6ced7efba4e307242379b04d56236c5ae24b9 Mon Sep 17 00:00:00 2001 From: niry1 Date: Mon, 13 Apr 2026 17:33:39 +0300 Subject: [PATCH 01/50] Start work on 1.0.0 --- .github/workflows/c-cpp.yml | 4 ++-- README.md | 12 +++++------- manpage | 2 +- specs/Directory.Build.props | 4 ++-- 4 files changed, 10 insertions(+), 12 deletions(-) diff --git a/.github/workflows/c-cpp.yml b/.github/workflows/c-cpp.yml index 5017d30..bdfffe2 100644 --- a/.github/workflows/c-cpp.yml +++ b/.github/workflows/c-cpp.yml @@ -2,9 +2,9 @@ name: C/C++ CI on: push: - branches: [ dev, stable, dev-0.9.9, dev-1.0.0, dev-packaging] + branches: [ dev, stable, dev-0.9.9, dev-1.0.0] pull_request: - branches: [ dev, stable, dev-0.9.9, dev-1.0.0, dev-packaging] + branches: [ dev, stable, dev-0.9.9, dev-1.0.0] env: SPECS_BRANCH: ${{ github.event.pull_request.base.ref }} diff --git a/README.md b/README.md index a5f7d49..be874c7 100644 --- a/README.md +++ b/README.md @@ -10,6 +10,11 @@ This version is liberally based on the [**CMS Pipelines User's Guide and Referen News ==== +11-Sep-2026: Version 1.0.0 is here + +What's new: + * Various TBD improvements +*** 1-May-2026: Version 0.9.9 is here What's new: @@ -25,13 +30,6 @@ What's new: *Note:* Installing from package does not include Python support on Windows. *Note:* On Linux, the `specs` binary is bigger when installed from package, as it is statically linked with libstdc++. -*** -28-Feb-2026: Version 0.9.6 is here - -What's new: - * Support for newer Linux distros (newer gcc) - * Support for Visual Studio and latest Windows versions - * Alignment with C++ coding standards Sources ======= diff --git a/manpage b/manpage index 9e29be7..445f60f 100644 --- a/manpage +++ b/manpage @@ -1,7 +1,7 @@ .\" Manpage for specs. 
.\" Open an issue at https://github.com/yoavnir/specs2016 to correct errors or typos .mso www.tmac -.TH man 1 "1 May 2026" "0.9.9" "specs man page" +.TH man 1 "11 Sep 2026" "1.0.0" "specs man page" .SH NAME specs \- a text processing tool .SH SYNOPSIS diff --git a/specs/Directory.Build.props b/specs/Directory.Build.props index 1b0821b..b448234 100644 --- a/specs/Directory.Build.props +++ b/specs/Directory.Build.props @@ -1,10 +1,10 @@ - 0.9.9 + 1.0.0 specs A re-writing of the specs pipeline stage from CMS, only changed quite a bit Copyright (c) 2018-2026 Yoav Nir $(GitTag) - 0,9,9,0 + 1,0,0,0 From 2fbcb371a4edec7fc97167fd6631378153cdc7a0 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Wed, 29 Apr 2026 08:45:46 +0300 Subject: [PATCH 02/50] Update README.md for 0.9.9 GA (#361) --- README.md | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index be874c7..c4c9825 100644 --- a/README.md +++ b/README.md @@ -22,7 +22,7 @@ What's new: * .pkg package for Mac OS * RPM for Linux * .deb package for Ubuntu/Debian - * Homebrew formula + * Visual Studio infra for building for Windows * Improved guessing of Python version * New spec units: `SPLITW` and `SPLITF` for splitting input records by words or fields into multiple output records. These new spec units support optional custom separators, `OF` clauses (with the same semantics as SUBSTRING), and range output placement (e.g. `splitw 1-10`). * A more exact `exact()` function @@ -37,6 +37,15 @@ To download your copy of *specs*, you can get it from [github](https://github.co 1. Using git: `git clone https://github.com/yoavnir/specs2016.git` 2. 
Using http: `wget https://github.com/yoavnir/specs2016/archive/dev.zip` +Installation from binaries +========================== +The binaries for the latest release can be downloaded from [**the release page**](https://github.com/yoavnir/specs2016/releases/tag/v0.9.9) + +Limitations: + * You will not get any Python support for Python integration on Windows + * You may get an older version of Python for Python integration on other platforms + * No support for exotic OS-es like Windows on ARM. + Building ======== If you have downloaded a git repository, first make sure to check out a stable tag such as v0.9.9: @@ -54,7 +63,10 @@ After that, _cd_ to the specs/src directory, and run the following three command * `make some` * `sudo make install` -*Note:* Windows does not need `sudo`. +*Note:* For Microsoft Windows, you can use **MSBuild** as follows: +* Start from the repository directory (do not _cd_ to specs/src) +* `msbuild specs/specs.sln /p:Configuration=Release /p:Platform=x64` +* Now copy the resulting `specs.exe` to a target directory in the path. *Note:* Only Python 3 is supported at this point. To enable Python support, you need to have the `python3-devel` package that matches your python version installed. 
From ca4ac1451d1ce45404885ff1b4a02f29f21a2f88 Mon Sep 17 00:00:00 2001 From: niry1 Date: Wed, 6 May 2026 11:56:58 +0300 Subject: [PATCH 03/50] Issue #363 - Allow Python in Windows build --- BUILDING.md | 96 ++++++++++++++++++++++++++++++++ README.md | 24 +------- specs/Directory.Build.props | 6 ++ specs/Directory.Build.targets | 71 +++++++++++++++++++++++ specs/src/ALUUnitTest.vcxproj | 4 +- specs/src/ProcessingTest.vcxproj | 4 +- specs/src/TokenTest.vcxproj | 4 +- specs/src/cacheTest.vcxproj | 4 +- specs/src/itemTest.vcxproj | 4 +- specs/src/readWriteTest.vcxproj | 4 +- specs/src/specs.vcxproj | 4 +- specs/src/timeTest.vcxproj | 4 +- 12 files changed, 190 insertions(+), 39 deletions(-) create mode 100644 BUILDING.md create mode 100644 specs/Directory.Build.targets diff --git a/BUILDING.md b/BUILDING.md new file mode 100644 index 0000000..a08ffb2 --- /dev/null +++ b/BUILDING.md @@ -0,0 +1,96 @@ +Building specs +============== + +Sources +------- +To download your copy of *specs*, you can get it from [github](https://github.com/yoavnir/specs2016) in either of two ways: +1. Using git: `git clone https://github.com/yoavnir/specs2016.git` +2. 
Using http: `wget https://github.com/yoavnir/specs2016/archive/dev.zip` + +Prerequisites +------------- +* A C++17-compatible compiler (GCC, Clang, or MSVC) +* Python 3 (optional, for Python integration support) + * On Linux: the `python3-devel` (or `python3-dev`) package that matches your Python version + * On Mac OS: the Xcode command-line tools (which include Python headers) + * On Windows: a standard Python 3 installation includes the required headers and libraries + +Checking out a stable version +----------------------------- +If you have downloaded a git repository, first make sure to check out a stable tag such as v0.9.9: +``` +git checkout v0.9.9 +``` +A good way to get the latest stable release is to check out the `stable` branch and rebase to its tip: +``` +git checkout stable +git rebase +``` + +Building on Linux and Mac OS (make) +------------------------------------ +Change to the `specs/src` directory, and run the following commands: +1. `python setup.py` -- use `python3` or `python3.x` if your default Python version is 2.7 +2. `make some` +3. `sudo make install` + +The `setup.py` script auto-detects your compiler, Python installation, and platform capabilities. It generates a `Makefile` tailored to your environment. + +### Python support +Python support is detected automatically by `setup.py`. To explicitly control it: +* `python setup.py --python python3.11` -- use a specific Python version +* `python setup.py --python no` -- disable Python support entirely + +Only Python 3 is supported. To enable Python support, you need the `python3-devel` package (or equivalent) that matches your Python version installed. + +### Notes +* On some Mac machines, `sudo make install` will cause a warning about being the wrong user. +* You can pass `-v DEBUG` to `setup.py` to build a debug version. +* You can pass `--static` to `setup.py` to statically link libstdc++ (useful for portable binaries). 
+ +Building on Windows with MSBuild +--------------------------------- +Start from the repository root directory (do **not** change to `specs/src`). + +### Without Python support (default) +``` +msbuild specs\specs.sln /p:Configuration=Release /p:Platform=x64 +``` + +### With Python support +To build with Python support, add `/p:EnablePython=true` to the command line. Python 3 and its development files must be installed on the build machine: +``` +msbuild specs\specs.sln /p:Configuration=Release /p:Platform=x64 /p:EnablePython=true +``` + +Python is auto-detected from the system PATH. If Python is not in your PATH or you want to use a specific installation, provide the installation directory explicitly: +``` +msbuild specs\specs.sln /p:Configuration=Release /p:Platform=x64 /p:EnablePython=true /p:PythonDir=C:\Python312 +``` + +You may also override the detected version numbers if needed: +``` +msbuild specs\specs.sln /p:Configuration=Release /p:Platform=x64 /p:EnablePython=true /p:PythonDir=C:\Python312 /p:PythonVerNoDot=312 /p:PythonFullVer=3.12.0 +``` + +### After building +Copy the resulting `specs.exe` from the `specs\bin\Release\` directory to a location in your PATH. + +### Notes +* With Python support enabled, the appropriate Python DLL (e.g., `python312.dll`) must be in the PATH at runtime. +* To build the Debug configuration, replace `Release` with `Debug` in the commands above. + +Building on Windows with make +----------------------------- +As an alternative to MSBuild, you can use `make` on Windows. Change to the `specs/src` directory and run: +1. `python setup.py -c VS` +2. `make some` + +This approach uses the Visual Studio `cl.exe` compiler via `make` and supports the same `--python` flag as on other platforms. + +Known Issues +------------ +* Regular expression grammars other than the default `ECMAScript` don't work except on Mac OS. +* On Windows with Python support, the appropriate DLL (like `python312.dll`) must be in the PATH. 
+ +*Note:* Although Windows for ARM64 is not officially supported, that platform will run the x64 version just fine. diff --git a/README.md b/README.md index c4c9825..05ef008 100644 --- a/README.md +++ b/README.md @@ -48,29 +48,7 @@ Limitations: Building ======== -If you have downloaded a git repository, first make sure to check out a stable tag such as v0.9.9: -``` -git checkout v0.9.9 -``` -A good way to get the latest stable release is to check out the `stable` branch and rebase to its tip: -``` -git checkout stable -git rebase -``` - -After that, _cd_ to the specs/src directory, and run the following three commands: -* `python setup.py` - use `python3` or `python3.x` if your default Python version is 2.7 -* `make some` -* `sudo make install` - -*Note:* For Microsoft Windows, you can use **MSBuild** as follows: -* Start from the repository directory (do not _cd_ to specs/src) -* `msbuild specs/specs.sln /p:Configuration=Release /p:Platform=x64` -* Now copy the resulting `specs.exe` to a target directory in the path. - -*Note:* Only Python 3 is supported at this point. To enable Python support, you need to have the `python3-devel` package that matches your python version installed. - -*Note:* On some Mac machines, `sudo make install` will cause a warning about being the wrong user. +For detailed build instructions covering Linux, Mac OS, and Windows (both `make` and MSBuild), see [BUILDING.md](BUILDING.md). 
Known Issues ============ diff --git a/specs/Directory.Build.props b/specs/Directory.Build.props index b448234..6f5f9a0 100644 --- a/specs/Directory.Build.props +++ b/specs/Directory.Build.props @@ -6,5 +6,11 @@ Copyright (c) 2018-2026 Yoav Nir $(GitTag) 1,0,0,0 + + false + SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A; + PYTHON_VER_3; + with Python + diff --git a/specs/Directory.Build.targets b/specs/Directory.Build.targets new file mode 100644 index 0000000..dc96fbf --- /dev/null +++ b/specs/Directory.Build.targets @@ -0,0 +1,71 @@ + + + + + + + <_PythonExe Condition="'$(PythonDir)' != ''">$(PythonDir)\python.exe + <_PythonExe Condition="'$(PythonDir)' == ''">python + + + + + + + + + + + + + + + + + + + + + + + + + $(PythonDir)\include;%(AdditionalIncludeDirectories) + PYTHON_FULL_VER=$(PythonFullVer);%(PreprocessorDefinitions) + + + + + + + + <_PythonLib>$(PythonDir)\libs\python$(PythonVerNoDot).lib + + + + $(_PythonLib);%(AdditionalDependencies) + + + + diff --git a/specs/src/ALUUnitTest.vcxproj b/specs/src/ALUUnitTest.vcxproj index 49a8ac1..a824806 100644 --- a/specs/src/ALUUnitTest.vcxproj +++ b/specs/src/ALUUnitTest.vcxproj @@ -52,7 +52,7 @@ Level3 Disabled true - _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion) - debug variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - debug variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) @@ -71,7 +71,7 @@ true true true - 
_CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion) - release variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - release variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) diff --git a/specs/src/ProcessingTest.vcxproj b/specs/src/ProcessingTest.vcxproj index f7980f4..3009b73 100644 --- a/specs/src/ProcessingTest.vcxproj +++ b/specs/src/ProcessingTest.vcxproj @@ -52,7 +52,7 @@ Level3 Disabled true - _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion) - debug variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - debug variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) @@ -71,7 +71,7 @@ true true true - _CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion) - release variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - release 
variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) diff --git a/specs/src/TokenTest.vcxproj b/specs/src/TokenTest.vcxproj index 6159c0c..7fa3e9c 100644 --- a/specs/src/TokenTest.vcxproj +++ b/specs/src/TokenTest.vcxproj @@ -52,7 +52,7 @@ Level3 Disabled true - _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion) - debug variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - debug variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) @@ -71,7 +71,7 @@ true true true - _CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion) - release variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - release variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) diff --git a/specs/src/cacheTest.vcxproj b/specs/src/cacheTest.vcxproj index cc9be37..4709e89 100644 --- a/specs/src/cacheTest.vcxproj +++ b/specs/src/cacheTest.vcxproj @@ -52,7 +52,7 @@ Level3 Disabled true - _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild 
$(MSBuildVersion) - debug variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - debug variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) @@ -71,7 +71,7 @@ true true true - _CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion) - release variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - release variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) diff --git a/specs/src/itemTest.vcxproj b/specs/src/itemTest.vcxproj index 8114fbc..e6234ea 100644 --- a/specs/src/itemTest.vcxproj +++ b/specs/src/itemTest.vcxproj @@ -52,7 +52,7 @@ Level3 Disabled true - _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion) - debug variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - debug variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) @@ -71,7 +71,7 @@ true true true - 
_CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion) - release variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - release variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) diff --git a/specs/src/readWriteTest.vcxproj b/specs/src/readWriteTest.vcxproj index d9fa306..e13b871 100644 --- a/specs/src/readWriteTest.vcxproj +++ b/specs/src/readWriteTest.vcxproj @@ -52,7 +52,7 @@ Level3 Disabled true - _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion) - debug variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - debug variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) @@ -71,7 +71,7 @@ true true true - _CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion) - release variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - release 
variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) diff --git a/specs/src/specs.vcxproj b/specs/src/specs.vcxproj index 89111ff..33a40d0 100644 --- a/specs/src/specs.vcxproj +++ b/specs/src/specs.vcxproj @@ -54,7 +54,7 @@ Level3 Disabled true - _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion) - debug variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - debug variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) @@ -74,7 +74,7 @@ true true MultiThreaded - _CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion) - release variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - release variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) diff --git a/specs/src/timeTest.vcxproj b/specs/src/timeTest.vcxproj index ec5cdb0..3c8f77f 100644 --- a/specs/src/timeTest.vcxproj +++ b/specs/src/timeTest.vcxproj @@ -52,7 +52,7 @@ Level3 Disabled true - _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild 
$(MSBuildVersion) - debug variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;DEBUG;ALU_DUMP;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - debug variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) @@ -71,7 +71,7 @@ true true true - _CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;SPECS_NO_PYTHON;PYTHON_FULL_VER=N/A;GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion) - release variation";%(PreprocessorDefinitions) + _CRT_SECURE_NO_WARNINGS;WIN64;ALURAND_wincrypt;PUT_TIME__SUPPORTED;ADVANCED_REGEX_FUNCTIONS;$(SpecsPythonDefs)GITTAG="$(GitTag)";LITERAL_PLATFORM="Windows (x64) system using MSBuild $(MSBuildVersion)$(SpecsPythonPlatformNote) - release variation";%(PreprocessorDefinitions) true stdcpp17 $(ProjectDir);%(AdditionalIncludeDirectories) From fbc3400e19d03539606e983fb7212908be17450c Mon Sep 17 00:00:00 2001 From: niry1 Date: Wed, 6 May 2026 13:13:12 +0300 Subject: [PATCH 04/50] Issue #353 - Build with Python in GitHub PR testing --- .github/workflows/c-cpp.yml | 22 ++++++++++++++++++++++ specs/Directory.Build.targets | 15 ++++++++++++++- 2 files changed, 36 insertions(+), 1 deletion(-) diff --git a/.github/workflows/c-cpp.yml b/.github/workflows/c-cpp.yml index 3963344..b5f45f9 100644 --- a/.github/workflows/c-cpp.yml +++ b/.github/workflows/c-cpp.yml @@ -85,3 +85,25 @@ jobs: - name: Test specs executable run: specs/bin/Release/specs.exe "@version" WRITE "@platform" + + build-windows-python: + runs-on: windows-latest + + steps: + - uses: actions/checkout@v4 + with: + fetch-depth: 0 + + - name: Set up Python 3.12 + uses: actions/setup-python@v5 + with: + python-version: '3.12' + + - name: Add MSBuild to PATH + uses: microsoft/setup-msbuild@v2 + + - name: Build specs 
with Python (Release) + run: msbuild specs/specs.sln /p:Configuration=Release /p:Platform=x64 /p:EnablePython=true + + - name: Test specs executable + run: specs/bin/Release/specs.exe "@version" WRITE "@platform" diff --git a/specs/Directory.Build.targets b/specs/Directory.Build.targets index dc96fbf..ba05254 100644 --- a/specs/Directory.Build.targets +++ b/specs/Directory.Build.targets @@ -24,10 +24,15 @@ + Condition="'$(PythonDir)' == ''" + IgnoreExitCode="true"> + + + + + + + + + From 9d8591aa2e2159023f00a6803d52dd984703f622 Mon Sep 17 00:00:00 2001 From: niry1 Date: Wed, 6 May 2026 13:47:34 +0300 Subject: [PATCH 05/50] Issue #353 - Add Python variations to the package artifacts --- .github/packaging/specs-python.wxs.in | 29 +++++++++ .github/workflows/release.yml | 90 ++++++++++++++++++++++++++- BUILDING.md | 2 +- README.md | 10 +-- 4 files changed, 125 insertions(+), 6 deletions(-) create mode 100644 .github/packaging/specs-python.wxs.in diff --git a/.github/packaging/specs-python.wxs.in b/.github/packaging/specs-python.wxs.in new file mode 100644 index 0000000..1cbf1c1 --- /dev/null +++ b/.github/packaging/specs-python.wxs.in @@ -0,0 +1,29 @@ + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index dca1c24..1cbeff8 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -238,6 +238,92 @@ jobs: name: windows-exe path: specs-*-windows-x64.exe + build-windows-python: + runs-on: windows-latest + steps: + - uses: actions/checkout@v4 + with: + fetch-depth: 0 + + - name: Set up Python 3.12 + uses: actions/setup-python@v5 + with: + python-version: '3.12' + + - name: Add MSBuild to PATH + uses: microsoft/setup-msbuild@v2 + + - name: Normalize Windows version metadata + id: version + shell: bash + run: | + raw_version="${SPECS_VERSION:-${{ github.event.inputs.version }}}" + if [ -z "$raw_version" ]; then + echo "Missing release version" >&2 + exit 1 + fi + + 
display_version="${raw_version#v}" + numeric_base="${display_version%%[-+]*}" + IFS='.' read -r major minor patch extra <<< "$numeric_base" + major="${major:-0}" + minor="${minor:-0}" + patch="${patch:-0}" + extra="${extra:-0}" + + echo "raw=$raw_version" >> "$GITHUB_OUTPUT" + echo "display=$display_version" >> "$GITHUB_OUTPUT" + echo "file=${major},${minor},${patch},${extra}" >> "$GITHUB_OUTPUT" + echo "msi=${major}.${minor}.${patch}" >> "$GITHUB_OUTPUT" + + - name: Generate Windows version header + shell: bash + run: | + cat > specs/src/specs_version.h < specs-python.wxs + + - name: Build MSI + shell: bash + run: wix build -o specs-${{ steps.version.outputs.display }}-python312.msi specs-python.wxs + + - name: Upload MSI artifact + uses: actions/upload-artifact@v4 + with: + name: windows-msi-python + path: specs-*-python312.msi + + - name: Upload standalone executable artifact + uses: actions/upload-artifact@v4 + with: + name: windows-exe-python + path: specs-*-python312-windows-x64.exe + build-deb: strategy: matrix: @@ -331,7 +417,7 @@ jobs: path: specs_*.deb publish: - needs: [build-linux, build-macos, build-windows, build-deb] + needs: [build-linux, build-macos, build-windows, build-windows-python, build-deb] runs-on: ubuntu-latest permissions: contents: write @@ -353,4 +439,6 @@ jobs: artifacts/linux-deb-arm64/*.deb artifacts/macos-pkg/*.pkg artifacts/windows-msi/*.msi + artifacts/windows-msi-python/*.msi artifacts/windows-exe/*.exe + artifacts/windows-exe-python/*.exe diff --git a/BUILDING.md b/BUILDING.md index a08ffb2..2b33807 100644 --- a/BUILDING.md +++ b/BUILDING.md @@ -93,4 +93,4 @@ Known Issues * Regular expression grammars other than the default `ECMAScript` don't work except on Mac OS. * On Windows with Python support, the appropriate DLL (like `python312.dll`) must be in the PATH. -*Note:* Although Windows for ARM64 is not officially supported, that platform will run the x64 version just fine. 
+*Note:* Although Windows for ARM64 is not officially supported, that platform will run the x64 version just fine. For Python integration, you'll need to install the x64 version of Python. diff --git a/README.md b/README.md index 05ef008..f0ce286 100644 --- a/README.md +++ b/README.md @@ -13,7 +13,9 @@ News 11-Sep-2026: Version 1.0.0 is here What's new: - * Various TBD improvements + * Support Python in `MSBuild` builds + * Added MSI and stand-alone Windows executable with Python 3.12 support + *** 1-May-2026: Version 0.9.9 is here @@ -42,9 +44,9 @@ Installation from binaries The binaries for the latest release can be downloaded from [**the release page**](https://github.com/yoavnir/specs2016/releases/tag/v0.9.9) Limitations: - * You will not get any Python support for Python integration on Windows - * You may get an older version of Python for Python integration on other platforms - * No support for exotic OS-es like Windows on ARM. + * You may get an older version of Python for Python integration. + * On Windows, you need to have Python 3.12 (exactly!) to get Python integration. + * On Windows for ARM, you need to install the x64 version of Python 3.12. Building ======== From b03091add7c75502ebf7fc0cf1ea9d8d5e001625 Mon Sep 17 00:00:00 2001 From: donglrd <604244493@qq.com> Date: Mon, 11 May 2026 17:26:10 +0800 Subject: [PATCH 06/50] Document Python local functions in manpage --- manpage | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/manpage b/manpage index 445f60f..71891bf 100644 --- a/manpage +++ b/manpage @@ -1771,6 +1771,52 @@ clause. The following two are equivalent: .P specs IF "word(1)=='UU'" + +.SH PYTHON FUNCTIONS +When compiled with Python support, +.I specs +can load user-defined functions from a Python file named +.B localfuncs.py. +The file is searched on +.B SPECSPATH, +the same path used for specification files. +It should be a normal importable Python module. 
+.PP +Functions whose names do not start with an underscore are made available to +.I specs +expressions. Helper functions can be kept private by starting their names with +an underscore. +.PP +For example, a +.B localfuncs.py +file can define a small formatting function: +.RS +.nf +def commas(value): + value = int(value) + return f"{value:,}" +.fi +.RE +.PP +It can then be called from a specification expression: +.RS +.nf +specs print "commas(word(1))" 1.12 RIGHT +.fi +.RE +.PP +Python functions may return an integer, floating-point number, string, or +.B None. +Docstrings are used by +.B specs --help pyfuncs +when listing loaded Python functions. Use +.B --pythonFuncs on +to force loading Python functions, +.B --pythonFuncs off +to disable them, or +.B --pythonFuncs auto +to load them only after an unknown function is encountered. + .SH OPTIONS .B specs supports the following switches: From 095e376ce1f6c4f47420449929306398f7f67406 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Mon, 11 May 2026 15:56:34 +0300 Subject: [PATCH 07/50] Issue #356 - Set Linux and Mac OS Builds to Python 3.12 (#367) * Issue #356 - Set Linux and Mac OS Builds to Python 3.12 * Issue #356 - use PyConfig API instead of Py_SetPythonHome * Issue #356 - Eliminate static build - depend on Python 3.12 * Issue #356 - Adjust README.md Used Devin in this change. 
--- .github/workflows/c-cpp.yml | 15 ++++++++--- .github/workflows/release.yml | 50 +++++++++++++++++------------------ README.md | 19 +++++++------ specs/src/setup.py | 2 +- specs/src/utils/PythonIntf.cc | 13 ++++++--- 5 files changed, 57 insertions(+), 42 deletions(-) diff --git a/.github/workflows/c-cpp.yml b/.github/workflows/c-cpp.yml index b5f45f9..8d4c777 100644 --- a/.github/workflows/c-cpp.yml +++ b/.github/workflows/c-cpp.yml @@ -25,12 +25,14 @@ jobs: echo "Event: ${{ github.event_name }}, Ref: ${{ github.ref }}, SPECS_BRANCH: ${SPECS_BRANCH}" echo "$PR_EVENT" - - name: Install Python dev headers - run: sudo apt-get install -y python3-dev + - name: Set up Python 3.12 + uses: actions/setup-python@v5 + with: + python-version: '3.12' - name: configure working-directory: specs/src - run: python3 setup.py --branch ${SPECS_BRANCH} --python python3 + run: python3.12 setup.py --branch ${SPECS_BRANCH} --python python3.12 - name: make working-directory: specs/src @@ -51,9 +53,14 @@ jobs: with: fetch-depth: 0 + - name: Set up Python 3.12 + uses: actions/setup-python@v5 + with: + python-version: '3.12' + - name: configure working-directory: specs/src - run: python3 setup.py -c CLANG --branch ${SPECS_BRANCH} --python python3 + run: python3.12 setup.py -c CLANG --branch ${SPECS_BRANCH} --python python3.12 - name: make working-directory: specs/src diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index 1cbeff8..a161edc 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -17,7 +17,7 @@ jobs: build-linux: runs-on: ubuntu-latest container: - image: ubuntu:20.04 + image: ubuntu:22.04 steps: - uses: actions/checkout@v4 with: @@ -28,14 +28,18 @@ jobs: EVENT: ${{ toJSON(github.event.release) }} run: echo "$EVENT" - - name: Install build tools + - name: Install build tools and Python 3.12 run: | DEBIAN_FRONTEND=noninteractive apt-get update - DEBIAN_FRONTEND=noninteractive apt-get install -y gcc g++ make python3 python3-dev 
rpm git libexpat1-dev zlib1g-dev dpkg-dev + DEBIAN_FRONTEND=noninteractive apt-get install -y software-properties-common + add-apt-repository -y ppa:deadsnakes/ppa + DEBIAN_FRONTEND=noninteractive apt-get update + DEBIAN_FRONTEND=noninteractive apt-get install -y gcc g++ make python3.12 python3.12-dev rpm git libexpat1-dev zlib1g-dev dpkg-dev - name: Configure working-directory: specs/src - run: python3 setup.py --branch ${SPECS_BRANCH} --python python3 --static + run: | + python3.12 setup.py --branch ${SPECS_BRANCH} --python python3.12 - name: Build working-directory: specs/src @@ -44,15 +48,6 @@ jobs: - name: Verify binary run: specs/exe/specs "@version" WRITE "@platform" - - name: Verify no dynamic libpython dependency - run: | - if ldd specs/exe/specs | grep -q libpython; then - echo "ERROR: specs still dynamically links libpython" - ldd specs/exe/specs | grep libpython - exit 1 - fi - echo "OK: no dynamic libpython dependency" - - name: Prepare manpage run: | cp manpage specs.1 @@ -112,9 +107,14 @@ jobs: with: fetch-depth: 0 + - name: Set up Python 3.12 + uses: actions/setup-python@v5 + with: + python-version: '3.12' + - name: Configure working-directory: specs/src - run: python3 setup.py -c CLANG --branch ${SPECS_BRANCH} --python python3 + run: python3.12 setup.py -c CLANG --branch ${SPECS_BRANCH} --python python3.12 - name: Build working-directory: specs/src @@ -333,6 +333,7 @@ jobs: container: ubuntu:22.04 - arch: arm64 runner: ubuntu-22.04-arm + container: ubuntu:22.04 runs-on: ${{ matrix.runner }} container: ${{ matrix.container || '' }} steps: @@ -340,19 +341,25 @@ jobs: with: fetch-depth: 0 - - name: Install build tools + - name: Install build tools and Python 3.12 run: | if [ "$(id -u)" -eq 0 ]; then APT="apt-get" else APT="sudo apt-get" fi + # Set timezone non-interactively to avoid tzdata prompts + ln -fs /usr/share/zoneinfo/UTC /etc/localtime $APT update - $APT install -y gcc g++ make python3 python3-dev dpkg-dev git libexpat1-dev zlib1g-dev + $APT 
install -y software-properties-common + add-apt-repository -y ppa:deadsnakes/ppa + $APT update + $APT install -y gcc g++ make python3.12 python3.12-dev dpkg-dev git libexpat1-dev zlib1g-dev - name: Configure working-directory: specs/src - run: python3 setup.py --branch ${SPECS_BRANCH} --python python3 --static + run: | + python3.12 setup.py --branch ${SPECS_BRANCH} --python python3.12 - name: Build working-directory: specs/src @@ -361,15 +368,6 @@ jobs: - name: Verify binary run: specs/exe/specs "@version" WRITE "@platform" - - name: Verify no dynamic libpython dependency - run: | - if ldd specs/exe/specs | grep -q libpython; then - echo "ERROR: specs still dynamically links libpython" - ldd specs/exe/specs | grep libpython - exit 1 - fi - echo "OK: no dynamic libpython dependency" - - name: Prepare manpage run: | cp manpage specs.1 diff --git a/README.md b/README.md index f0ce286..b4480e2 100644 --- a/README.md +++ b/README.md @@ -13,6 +13,7 @@ News 11-Sep-2026: Version 1.0.0 is here What's new: + * All pre-built binaries now work with Python 3.12 * Support Python in `MSBuild` builds * Added MSI and stand-alone Windows executable with Python 3.12 support @@ -31,8 +32,6 @@ What's new: *Note:* Installing from package does not include Python support on Windows. -*Note:* On Linux, the `specs` binary is bigger when installed from package, as it is statically linked with libstdc++. 
- Sources ======= To download your copy of *specs*, you can get it from [github](https://github.com/yoavnir/specs2016) in either of two ways: @@ -41,21 +40,25 @@ To download your copy of *specs*, you can get it from [github](https://github.co Installation from binaries ========================== -The binaries for the latest release can be downloaded from [**the release page**](https://github.com/yoavnir/specs2016/releases/tag/v0.9.9) +The binaries for the latest release can be downloaded from [**the release page**](https://github.com/yoavnir/specs2016/releases/tag/v1.0.0) + +**Requirements:** + * **Python 3.12 must be installed on your target machine.** All pre-built binaries (Linux RPM, Linux DEB, macOS .pkg, and Windows MSI/executable) are dynamically linked against Python 3.12. -Limitations: - * You may get an older version of Python for Python integration. - * On Windows, you need to have Python 3.12 (exactly!) to get Python integration. - * On Windows for ARM, you need to install the x64 version of Python 3.12. +**Notes:** + * On Windows for ARM, you may install the x64 version of Python 3.12. + * Recent Mac OS versions are very strict on where packages come from. You may need to issue the following command to get the .pkg file to install: `xattr -dr com.apple.quarantine /path/to/specs-1.0.0.pkg` Building ======== For detailed build instructions covering Linux, Mac OS, and Windows (both `make` and MSBuild), see [BUILDING.md](BUILDING.md). +**Note on Python versions:** The pre-built binaries are linked against Python 3.12. If you need to use a different version of Python, or if Python 3.12 is not available on your target platform, you must build `specs` locally from source. When building, you can specify which Python version to use via the `--python` option to `setup.py` (on Linux/macOS) or by setting the appropriate Python version in your Visual Studio environment (on Windows). 
+ Known Issues ============ * Regular expression grammars other than the default `ECMAScript` don't work except on Mac OS. -* On Windows with Python support the appropriate dll (like `python38.dll`) must be in the path. +* On Windows with Python support, `python312.dll` must be in the path (or Python 3.12 must be installed). Contributing ============ diff --git a/specs/src/setup.py b/specs/src/setup.py index 9d0abaf..e76c132 100644 --- a/specs/src/setup.py +++ b/specs/src/setup.py @@ -719,7 +719,7 @@ def python_search(arg): # The static libpython archive includes built-in extension modules # (pyexpat, zlib, etc.) that depend on these system libraries. static_pyldflags.extend(["-lexpat", "-lz"]) - # Ubuntu 20.04's libpython3.8.a is not PIE-compatible, so disable PIE + # Older libpython static archives may not be PIE-compatible, so disable PIE static_pyldflags.append("-no-pie") condlink = condlink + " " + " ".join(static_pyldflags) else: diff --git a/specs/src/utils/PythonIntf.cc b/specs/src/utils/PythonIntf.cc index e4b7c34..3f5e752 100644 --- a/specs/src/utils/PythonIntf.cc +++ b/specs/src/utils/PythonIntf.cc @@ -284,10 +284,17 @@ class PythonFunctionCollection : public ExternalFunctionCollection { } // Initialize Python environment #ifdef PYTHON_STDLIB_PATH - // When Python is statically linked, set the home directory to our bundled stdlib - Py_SetPythonHome(Py_DecodeLocale(PYTHON_STDLIB_PATH, NULL)); -#endif + // When Python is statically linked, use PyConfig to set the home + // directory to our bundled stdlib (Py_SetPythonHome was deprecated + // in Python 3.11). 
+ PyConfig config; + PyConfig_InitPythonConfig(&config); + PyConfig_SetBytesString(&config, &config.home, PYTHON_STDLIB_PATH); + Py_InitializeFromConfig(&config); + PyConfig_Clear(&config); +#else Py_Initialize(); +#endif // update the python path if (_path && _path[0]) { From 736eef80891da9613422918f4eb97c829fcb17a6 Mon Sep 17 00:00:00 2001 From: Miriam Date: Tue, 12 May 2026 08:06:31 +0300 Subject: [PATCH 08/50] feat: display Python error details by default (#372) Replace error handling with exit on failure to load local functions --- specs/src/utils/PythonIntf.cc | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/specs/src/utils/PythonIntf.cc b/specs/src/utils/PythonIntf.cc index 3f5e752..52faa75 100644 --- a/specs/src/utils/PythonIntf.cc +++ b/specs/src/utils/PythonIntf.cc @@ -335,11 +335,9 @@ class PythonFunctionCollection : public ExternalFunctionCollection { m_Initialized = true; return; } else { - if (g_bVerbose) { - std::cerr << "Python Interface: Error loading local functions: "; - PyErr_Print(); - } - MYTHROW("Error loading local functions"); + std::cerr << "Python Interface: Error loading local functions: " << std::endl; + PyErr_Print(); + exit(0); } } From 8df4fde8633f6fcd0a81460dc135d9316327e5c5 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Tue, 12 May 2026 15:50:47 +0300 Subject: [PATCH 09/50] Improve error message for Python func returning invalid value type (#375) --- specs/src/utils/PythonIntf.cc | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/specs/src/utils/PythonIntf.cc b/specs/src/utils/PythonIntf.cc index 52faa75..e554616 100644 --- a/specs/src/utils/PythonIntf.cc +++ b/specs/src/utils/PythonIntf.cc @@ -182,15 +182,20 @@ class PythonFuncRec : public ExternalFunctionRec { pRet = mkValue(std::string("")); } } else if (PyString_Check(pResult)) { - pRet = mkValue(PyString_AS_STRING(pResult)); + pRet = mkValue(PyUnicode_AsUTF8(pResult)); } else if (Py_None == pResult){ pRet = mkValue0(); 
// NaN } else { - PyObject* pRepr = PyObject_Repr(pResult); - std::string err = "Invalid return type from function "; - err += m_name + ": "; - err += PyString_AS_STRING(pRepr); - Py_DECREF(pRepr); + std::string err = "Invalid return type <"; + err += Py_TYPE(pResult)->tp_name; + err += "> from function "; + err += m_name; + if (g_bVerbose) { + PyObject* pRepr = PyObject_Repr(pResult); + err += " with content "; + err += PyUnicode_AsUTF8(pRepr); + Py_DECREF(pRepr); + } Py_DECREF(pResult); MYTHROW(err); } From 27b2b713451db2f3f6c40a49ce584e0c3a83e998 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Sun, 17 May 2026 09:33:08 +0300 Subject: [PATCH 10/50] Issue #209 - GDB debugging aids (#373) --- .gitignore | 4 + README.md | 3 +- agents.md | 1 + specs/docs/TOC.md | 1 + specs/docs/debugging.md | 501 ++++++++++++ specs/src/.gdbinit | 5 + specs/src/gdb/COMMANDS.md | 152 ++++ specs/src/gdb/specs.gdb | 282 +++++++ specs/src/gdb/specs_gdb.py | 1481 ++++++++++++++++++++++++++++++++++++ 9 files changed, 2429 insertions(+), 1 deletion(-) create mode 100644 specs/docs/debugging.md create mode 100644 specs/src/.gdbinit create mode 100644 specs/src/gdb/COMMANDS.md create mode 100644 specs/src/gdb/specs.gdb create mode 100644 specs/src/gdb/specs_gdb.py diff --git a/.gitignore b/.gitignore index 822d12c..fa0913d 100644 --- a/.gitignore +++ b/.gitignore @@ -44,3 +44,7 @@ specs/src/Release/ # Mac OS stuff .DS_Store + +# GDB-related +specs/src/.gdbinit +specs/src/gdb/__pycache__ diff --git a/README.md b/README.md index b4480e2..fb65923 100644 --- a/README.md +++ b/README.md @@ -15,7 +15,8 @@ News What's new: * All pre-built binaries now work with Python 3.12 * Support Python in `MSBuild` builds - * Added MSI and stand-alone Windows executable with Python 3.12 support + * Added MSI and stand-alone Windows executable to release artifacts + * Debugging aids for GDB *** 1-May-2026: Version 0.9.9 is here diff --git a/agents.md b/agents.md index 8dadfdf..2824abb 100644 --- a/agents.md +++ 
b/agents.md @@ -6,3 +6,4 @@ - Keep any new `ProcessingTest.cc` regression cases appended at the end when possible, to avoid unnecessary renumbering. - `MYASSERT(cond)` and `MYASSERT_WITH_MSG(cond, msg)` (defined in `specs/src/utils/ErrorReporting.h`) are **always-on runtime checks** that throw `SpecsException` via `MYTHROW`. They are *not* compiled out by `NDEBUG`. Do not add redundant `if`/`MYTHROW` guards that duplicate what a `MYASSERT` already covers. - When building the project, always use the command-line `make clean all`. Do not skip the `clean` target, and do not use the `-j` argument. +- When changing the structure of any of the classes that have dump_ macros in `specs_gdb.py` and `specs.gdb` update the relevant macros as well. diff --git a/specs/docs/TOC.md b/specs/docs/TOC.md index bf9cd64..b6e6d58 100644 --- a/specs/docs/TOC.md +++ b/specs/docs/TOC.md @@ -40,6 +40,7 @@ Basic Functionality Advanced Topics =============== +* [Debugging specs with GDB](debugging.md) * [Command-Line Switches](cliswitch.md) * [Advanced ALU](alu_adv.md) * [Table of Operands](alu_adv.md#table-of-operands) diff --git a/specs/docs/debugging.md b/specs/docs/debugging.md new file mode 100644 index 0000000..e2dcb1b --- /dev/null +++ b/specs/docs/debugging.md @@ -0,0 +1,501 @@ +# Using GDB to Debug specs + +This guide explains how to use the GDB debugging toolkit for specs, which provides convenient "dump" commands for inspecting all major classes and data structures during debugging. + +## Table of Contents + +1. [Building with Debug Symbols](#building-with-debug-symbols) +2. [Loading the GDB Macros](#loading-the-gdb-macros) +3. [Quick Reference](#quick-reference) +4. [Detailed Examples](#detailed-examples) +5. [Tips and Tricks](#tips-and-tricks) +6. [Troubleshooting](#troubleshooting) + +--- + +## Building with Debug Symbols + +To debug specs effectively, you must build with debug symbols enabled. 
+ +### On Linux and macOS + +```bash +cd specs/src +python setup.py -v DEBUG +make clean all +``` + +The `-v DEBUG` flag tells the setup script to enable debug symbols and disable optimizations, making it easier to inspect variables and step through code. + +### On Windows + +```cmd +cd specs\src +python setup.py -v DEBUG -c VS +msbuild specs\specs.sln /p:Configuration=Debug /p:Platform=x64 +``` + +--- + +## Loading the GDB Macros + +### Automatic Loading (Recommended) + +When you run GDB from the `specs/src/` directory, the `.gdbinit` file is automatically loaded: + +```bash +cd specs/src +gdb ./specs +``` + +GDB will automatically source `gdb/specs.gdb`, which loads the Python extension and registers all dump commands. + +### Manual Loading + +If you're running GDB from a different directory, you can manually load the macros: + +```bash +gdb ./specs -x specs/src/gdb/specs.gdb +``` + +Or from within GDB: + +``` +(gdb) source specs/src/gdb/specs.gdb +``` + +### Verify Loading + +After loading, you should see a welcome message: + +``` +======================================== +specs GDB debugging macros loaded +======================================== + +Available commands: + dump_pstate - Dump ProcessingState + dump_sb - Dump StringBuilder + dump_item - Dump Item (polymorphic) + ... +``` + +--- + +## Quick Reference + +### ProcessingState Commands + +| Command | Purpose | +|---------|---------| +| `dump_pstate ` | Dump the entire ProcessingState (current record, cycle counter, separators, etc.) | +| `dump_sb ` | Dump the StringBuilder (output string being built, current position) | + +### Item Commands + +| Command | Purpose | +|---------|---------| +| `dump_item ` | Dump an Item (polymorphic; detects DataField, TokenItem, etc.) 
| +| `dump_items ` | Dump an itemGroup (list of all compiled spec items) | +| `dump_data_field ` | Dump a DataField (input source, output placement, conversion) | +| `dump_token_item ` | Dump a TokenItem | +| `dump_set_item ` | Dump a SetItem (assignment expression) | +| `dump_condition_item ` | Dump a ConditionItem (IF/WHILE/etc.) | +| `dump_split_item ` | Dump a SplitItem (SPLITW/SPLITF) | + +### InputPart Commands + +| Command | Purpose | +|---------|---------| +| `dump_literal_part ` | Dump a LiteralPart (literal string) | +| `dump_range_part ` | Dump a RangePart (character range) | +| `dump_word_range_part ` | Dump a WordRangePart (word range with separator) | +| `dump_field_range_part ` | Dump a FieldRangePart (field range with separator) | +| `dump_clock_part ` | Dump a ClockPart (time value) | +| `dump_id_part ` | Dump an IDPart (field identifier) | +| `dump_expr_part ` | Dump an ExpressionPart (ALU expression) | + +### Token Commands + +| Command | Purpose | +|---------|---------| +| `dump_token ` | Dump a Token (type, literal, original text) | +| `dump_token_range ` | Dump a TokenFieldRange (range specification) | + +### ALU Commands + +| Command | Purpose | +|---------|---------| +| `dump_alu_value ` | Dump an ALUValue (type, value, exactness) | +| `dump_alu_counters ` | Dump ALUCounters (all counter variables) | +| `dump_alu_unit ` | Dump an AluUnit (polymorphic; literal, counter, operator, etc.) 
| +| `dump_alu_vec ` | Dump an AluVec (vector of AluUnits) | +| `dump_alu_stats ` | Dump AluValueStats (statistical data) | +| `dump_freq_map ` | Dump a frequencyMap (frequency distribution) | + +### Utility Commands + +| Command | Purpose | +|---------|---------| +| `dump_reader ` | Dump a Reader (record count, EOF state) | +| `dump_writer ` | Dump a Writer (output count) | +| `dump_exception ` | Dump a SpecsException (file, line, message) | + +### Python Interface Commands + +| Command | Purpose | +|---------|----------| +| `dump_alu_function ` | Dump an AluFunction (name, arg count, input dependency) | +| `dump_external_func_rec ` | Dump an ExternalFunctionRec (polymorphic base class) | +| `dump_external_func_collection ` | Dump an ExternalFunctionCollection (initialization state) | +| `dump_python_func_collection ` | Dump a PythonFunctionCollection (internal Python function registry) | +| `dump_python_func_rec ` | Dump a PythonFuncRec (Python function record with name and args) | +| `dump_python_func_arg ` | Dump a PythonFuncArg (function argument with default value) | + +### Breakpoint Helpers + +| Command | Purpose | +|---------|---------| +| `bp_apply` | Set breakpoint on Item::apply | +| `bp_getstr` | Set breakpoint on InputPart::getStr | +| `bp_compile` | Set breakpoint on itemGroup::Compile | + +--- + +## Detailed Examples + +### Example 1: Inspecting ProcessingState During Execution + +Suppose you're debugging a spec that processes records and you want to see the current state: + +``` +(gdb) break Item::apply +Breakpoint 1 at 0x... + +(gdb) run < input.txt +Starting program: ./specs ... +Breakpoint 1, Item::apply (this=0x..., pState=0x..., pSB=0x...) at specitems/specItems.cc:...
+ +(gdb) dump_pstate pState +ProcessingState @ 0x7fffffffde00 + Current Record: "hello world" + Previous Record: "goodbye world" + Pad Char: ' ' (0x20) + Word Separator: " " + Field Separator: "\t" + Cycle Counter: 42 + Extra Reads: 0 + Record Count: 42 + Word Count: 2 + Field Count: 1 + Input Station: -1 + Input Stream: 1 + Output Index: 1 + No Write: false + EOF: false +``` + +This shows you exactly what the current record is, how many times we've processed records, and the current separators. + +### Example 2: Inspecting a DataField + +When debugging a data field specification: + +``` +(gdb) dump_data_field pDataField +DataField @ 0x... + m_label: A + m_outStart: 10 + m_maxLength: 20 + m_strip: true + m_conversion: UCASE + m_alignment: Left +``` + +This tells you that the field is labeled 'A', outputs starting at column 10, has a max length of 20 characters, strips whitespace, converts to uppercase, and is left-aligned. + +### Example 3: Walking an itemGroup + +To see the entire compiled specification: + +``` +(gdb) dump_items pItemGroup +itemGroup @ 0x... + bNeedRunoutCycle: true + bFoundSelectSecond: false + Item count: 5 + Items: + [0] @ 0x... + [1] @ 0x... + [2] @ 0x... + [3] @ 0x... + [4] @ 0x... +``` + +Then you can inspect individual items: + +``` +(gdb) dump_item pItemGroup.m_items[0] +Item @ 0x... + m_originalIndex: 0 + Debug: {Source=Range[1:10];Dest=@10L20} + readsLines: true + producesOutput: true + forcesRunoutCycle: false + isBreak: false +``` + +### Example 4: Examining ALU Expressions + +When debugging expression evaluation: + +``` +(gdb) dump_alu_value myALUValue +ALUValue @ 0x... + m_type: Int + m_value: "42" + m_exact: true + +(gdb) dump_alu_counters g_counters +ALUCounters @ 0x... + Counters (map): + m_map @ 0x... +``` + +### Example 5: Conditional Breakpoints with Cycle Counter + +To break only on a specific record number: + +``` +(gdb) break Item::apply if pState.m_CycleCounter == 100 +Breakpoint 1 at 0x... + +(gdb) run < input.txt +... 
+Breakpoint 1, Item::apply (this=0x..., pState=0x..., pSB=0x...) at specitems/specItems.cc:... + +(gdb) dump_pstate pState +ProcessingState @ 0x... + Cycle Counter: 100 + ... +``` + +This is useful for debugging issues that only occur on specific records. + +### Example 6: Debugging Python Function Integration + +When debugging Python function calls and integration: + +``` +(gdb) break PythonIntf.cc:167 +Breakpoint 1 at 0x... + +(gdb) run -f myspec.txt < input.txt +... +Breakpoint 1, PyObject_CallObject (...) at PythonIntf.cc:167 + +(gdb) dump_python_func_collection g_PythonFunctions +PythonFunctionCollection @ 0x... + m_Initialized: true + m_Functions @ 0x... + +(gdb) dump_python_func_rec g_PythonFunctions.m_Functions[0] +PythonFuncRec @ 0x... + m_name: my_custom_function + m_pFuncPtr: 0x... + m_doc: Computes the custom value based on input + m_pTuple: 0x... + m_args (2 items): + [0] input_value (default: counterType__Int) + = 0 + [1] multiplier (default: counterType__Float) + = 1.5 +``` + +This shows you the complete function signature, documentation, and argument defaults. The `m_pTuple` field shows whether arguments have been prepared for the function call. + +--- + +## Tips and Tricks + +### 1. Using Pretty-Printers + +The GDB extension includes pretty-printers for common types. When you print a variable, it's automatically formatted nicely: + +``` +(gdb) p myALUValue +$1 = ALUValue {type: Int, value: "42", exact: true} + +(gdb) p myToken +$2 = Token {type: RANGE, literal: "", argc: 0} +``` + +### 2. Inspecting Shared Pointers + +The extension can dereference `std::shared_ptr` automatically: + +``` +(gdb) p myDataField.m_InputPart +$3 = std::shared_ptr (use count=2, weak count=0) 0x... + +(gdb) dump_input_part myDataField.m_InputPart +InputPart @ 0x... + Debug: Range[1:10] + readsLines: true + forcesRunoutCycle: false +``` + +### 3. 
 Setting Breakpoints on Virtual Methods + +Since specs uses polymorphism extensively, you can break on virtual methods: + +``` +(gdb) break InputPart::getStr +Breakpoint 1 at 0x... + +(gdb) run < input.txt +... +Breakpoint 1, LiteralPart::getStr (this=0x..., pState=...) at specitems/InputPart.cc:... +``` + +GDB will break on any derived class's implementation. + +### 4. Examining the Specification Before Execution + +You can set a breakpoint at the start of processing and dump the entire compiled spec: + +``` +(gdb) break itemGroup::process +Breakpoint 1 at 0x... + +(gdb) run < input.txt +... +Breakpoint 1, itemGroup::process (this=0x..., sb=..., pState=..., rd=..., tmr=...) at specitems/specItems.cc:... + +(gdb) dump_items this +itemGroup @ 0x... + Item count: 5 + Items: + [0] @ 0x... + [1] @ 0x... + ... +``` + +### 5. Tracking State Changes + +Attach a command list to a breakpoint to dump the state each time it changes: + +``` +(gdb) break ProcessingState::setString +Breakpoint 1 at 0x... + +(gdb) commands +> dump_pstate this +> continue +> end + +(gdb) run < input.txt +ProcessingState @ 0x... + Current Record: "line 1" + Cycle Counter: 1 +ProcessingState @ 0x... + Current Record: "line 2" + Cycle Counter: 2 +... +``` + +--- + +## Troubleshooting + +### Python Extension Not Loading + +**Error:** `ImportError: No module named specs_gdb` + +**Solution:** Make sure you're running GDB from the `specs/src/` directory, or manually source the `.gdb` file: + +```bash +cd specs/src +gdb ./specs +``` + +Or: + +```bash +gdb ./specs -x specs/src/gdb/specs.gdb +``` + +### Dump Commands Not Found + +**Error:** `Undefined command: "dump_pstate"` + +**Solution:** The Python extension may not have loaded. Check that the `.gdb` file was sourced: + +``` +(gdb) source specs/src/gdb/specs.gdb +``` + +### Unable to Read Variables + +**Error:** `Error: ` + +**Solution:** This usually means the inferior (the running program) is in a bad state or the variable is uninitialized. Try: + +1.
Step to a different location in the code +2. Check that the variable is actually in scope +3. Use `info locals` to see available variables + +### Calling Methods Fails + +**Error:** `Error: ` + +**Solution:** Some methods may not be callable if the inferior is corrupted or in an inconsistent state. This is normal. The dump commands will still show the raw member variables. + +### GDB Crashes When Calling Methods + +**Solution:** If calling virtual methods causes GDB to crash, you can disable method invocation by editing `specs_gdb.py` and commenting out the `call_method_safe` calls. + +--- + +## Building and Debugging Tips + +### Debugging a Specific Spec + +Create a test input file and a spec file, then run: + +```bash +cd specs/src +gdb ./specs +(gdb) set args -f myspec.txt < input.txt +(gdb) break itemGroup::Compile +(gdb) run +(gdb) dump_items this +``` + +### Debugging Parsing + +To debug specification parsing: + +```bash +(gdb) break itemGroup::Compile +(gdb) run -f myspec.txt < input.txt +(gdb) dump_items this +``` + +### Debugging Expression Evaluation + +To debug ALU expression evaluation: + +```bash +(gdb) break AluFunction::evaluate +(gdb) run -f myspec.txt < input.txt +(gdb) dump_alu_unit this +``` + +--- + +## See Also + +- [specs User Manual](basicspec.md) +- [ALU Reference](alu.md) +- [GDB Manual](https://sourceware.org/gdb/documentation/) diff --git a/specs/src/.gdbinit b/specs/src/.gdbinit new file mode 100644 index 0000000..19a23ec --- /dev/null +++ b/specs/src/.gdbinit @@ -0,0 +1,5 @@ +# specs GDB initialization file +# This file is automatically loaded by GDB when you run it from specs/src/ + +# Source the convenience macros and Python extension +source gdb/specs.gdb diff --git a/specs/src/gdb/COMMANDS.md b/specs/src/gdb/COMMANDS.md new file mode 100644 index 0000000..2c85084 --- /dev/null +++ b/specs/src/gdb/COMMANDS.md @@ -0,0 +1,152 @@ +# GDB Dump Commands Reference + +## InputPart Hierarchy Commands + +| Command | Alias | Description | 
+|---------|-------|-------------| +| `dump-input-part` | — | Dump an InputPart (polymorphic) | +| `dump-literal-part` | — | Dump a LiteralPart | +| `dump-range-part` | — | Dump a RangePart | +| `dump-word-range-part` | — | Dump a WordRangePart | +| `dump-field-range-part` | — | Dump a FieldRangePart | +| `dump-clock-part` | — | Dump a ClockPart | +| `dump-id-part` | — | Dump an IDPart | +| `dump-expression-part` | `dump_expr_part` | Dump an ExpressionPart | + +## Item Hierarchy Commands + +| Command | Alias | Description | +|---------|-------|-------------| +| `dump-item` | `dump_item` | Dump an Item (polymorphic) | +| `dump-data-field` | `dump_data_field` | Dump a DataField | +| `dump-token-item` | `dump_token_item` | Dump a TokenItem | +| `dump-set-item` | `dump_set_item` | Dump a SetItem | +| `dump-skip-item` | `dump_skip_item` | Dump a SkipItem | +| `dump-condition-item` | `dump_condition_item` | Dump a ConditionItem | +| `dump-break-item` | `dump_break_item` | Dump a BreakItem | +| `dump-select-item` | `dump_select_item` | Dump a SelectItem | +| `dump-split-item` | `dump_split_item` | Dump a SplitItem | + +## itemGroup Command + +| Command | Alias | Description | +|---------|-------|-------------| +| `dump-item-group` | `dump_items` | Dump an itemGroup | + +## Token System Commands + +| Command | Alias | Description | +|---------|-------|-------------| +| `dump-token` | `dump_token` | Dump a Token | +| `dump-token-range` | `dump_token_range` | Dump a TokenFieldRange | + +## Processing Commands + +| Command | Alias | Description | +|---------|-------|-------------| +| `dump-processing-state` | `dump_pstate` | Dump a ProcessingState | +| `dump-string-builder` | `dump_sb` | Dump a StringBuilder | +| `dump-reader` | `dump_reader` | Dump a Reader | +| `dump-writer` | `dump_writer` | Dump a Writer | + +## ALU Commands + +| Command | Alias | Description | +|---------|-------|-------------| +| `dump-alu-value` | `dump_alu_value` | Dump an ALUValue | +| 
 `dump-alu-counters` | `dump_alu_counters` | Dump ALUCounters | +| `dump-alu-unit` | `dump_alu_unit` | Dump an AluUnit (polymorphic) | +| `dump-alu-vec` | `dump_alu_vec` | Dump an AluVec | +| `dump-alu-value-stats` | `dump_alu_stats` | Dump AluValueStats | +| `dump-frequency-map` | `dump_freq_map` | Dump a frequencyMap | + + +## Python Interface Commands + +| Command | Alias | Description | +|---------|-------|-------------| +| `dump-alu-function` | `dump_alu_function` | Dump an AluFunction (name, arg count, input dependency) | +| `dump-external-function-rec` | `dump_external_func_rec` | Dump an ExternalFunctionRec (calls virtual methods GetArgCount/GetFuncPtr) | +| `dump-external-function-collection` | `dump_external_func_collection` | Dump an ExternalFunctionCollection (initialization state) | +| `dump-python-function-collection` | `dump_python_func_collection` | Dump a PythonFunctionCollection (registry state and function count) | +| `dump-python-func-rec` | `dump_python_func_rec` | Dump a PythonFuncRec (name, pointer, doc, and expanded argument list) | +| `dump-python-func-arg` | `dump_python_func_arg` | Dump a PythonFuncArg (name, default type, and default value) | + +## Utility Commands + +| Command | Alias | Description | +|---------|-------|-------------| +| `dump-exception` | `dump_exception` | Dump a SpecsException | +| `dump-all` | — | Dump all relevant debugging info | + +## Breakpoint Helpers + +| Command | Description | +|---------|-------------| +| `bp_apply` | Set breakpoint on Item::apply | +| `bp_getstr` | Set breakpoint on InputPart::getStr | +| `bp_compile` | Set breakpoint on itemGroup::Compile | +| `bp_parseAluExpression` | Set breakpoint on parseAluExpression, where expressions are parsed | +| `bp_pfc_initialize` | Set breakpoint on PythonFunctionCollection::Initialize, where the Python Function Collection is initialized | +| `bp_func_setargvalue` | Set breakpoint on PythonFuncRec::setArgValue, where an argument for an
 external function is set | +| `bp_func_call` | Set breakpoint on PythonFuncRec::Call, where an external function is invoked | + +## Usage Examples + +### Dump ProcessingState +```gdb +(gdb) dump_pstate pState +ProcessingState @ 0x7fffffffde00 + Current Record: "hello world" + Previous Record: "goodbye world" + Pad Char: ' ' (0x20) + Word Separator: " " + Field Separator: "\t" + Cycle Counter: 42 + ... +``` + +### Dump Item +```gdb +(gdb) dump_item myItem +Item @ 0x... + m_originalIndex: 0 + Debug: {Source=Range[1:10];Dest=@10L20} + readsLines: true + producesOutput: true + ... +``` + +### Dump ALUValue +```gdb +(gdb) dump_alu_value myValue +ALUValue @ 0x... + m_type: Int + m_value: "42" + m_exact: true +``` + +### Set Breakpoint on Item::apply +```gdb +(gdb) bp_apply +Breakpoint 1 at 0x... +(gdb) run +Breakpoint 1, Item::apply (this=0x..., pState=0x..., pSB=0x...) at specitems/specItems.cc:... +(gdb) dump_pstate pState +``` + +## Getting Help + +To get help on any command: +```gdb +(gdb) help dump-processing-state +(gdb) help dump-item +(gdb) help dump_pstate +``` + +## Notes + +- Commands with aliases can be called either way: `dump-processing-state` or `dump_pstate` +- Polymorphic commands (marked with "(polymorphic)") automatically detect the actual derived type +- All dump commands are safe to call even if the inferior is in a bad state +- Virtual method calls (like `Debug()`) are guarded with error handling diff --git a/specs/src/gdb/specs.gdb b/specs/src/gdb/specs.gdb new file mode 100644 index 0000000..36fc163 --- /dev/null +++ b/specs/src/gdb/specs.gdb @@ -0,0 +1,282 @@ +# specs GDB convenience macros and initialization +# This file sources the Python GDB extension and provides shorthand commands + +# Source the Python extension +python +import sys, os +# Try to find specs_gdb.py in common locations +search_paths = [ + os.path.join(os.getcwd(), 'gdb'), # Current dir + gdb/ + os.path.join(os.getcwd(), 'specs', 'src', 'gdb'), # specs/src/gdb from project
root + os.path.join(os.getcwd(), '..', 'gdb'), # Parent dir + gdb/ + 'gdb', # Just gdb/ in current dir +] + +gdb_script_dir = None +for path in search_paths: + if os.path.isfile(os.path.join(path, 'specs_gdb.py')): + gdb_script_dir = path + break + +if gdb_script_dir: + sys.path.insert(0, gdb_script_dir) + import specs_gdb +else: + print("Warning: Could not find specs_gdb.py in expected locations") + print("Searched: " + ", ".join(search_paths)) +end + +# ============================================================================ +# CONVENIENCE ALIASES FOR DUMP COMMANDS +# ============================================================================ + +# InputPart hierarchy +define dump_literal_part + dump-literal-part $arg0 +end + +define dump_range_part + dump-range-part $arg0 +end + +define dump_word_range_part + dump-word-range-part $arg0 +end + +define dump_field_range_part + dump-field-range-part $arg0 +end + +define dump_clock_part + dump-clock-part $arg0 +end + +define dump_id_part + dump-id-part $arg0 +end + +define dump_expr_part + dump-expression-part $arg0 +end + +# Item hierarchy +define dump_item + dump-item $arg0 +end + +define dump_data_field + dump-data-field $arg0 +end + +define dump_token_item + dump-token-item $arg0 +end + +define dump_set_item + dump-set-item $arg0 +end + +define dump_skip_item + dump-skip-item $arg0 +end + +define dump_condition_item + dump-condition-item $arg0 +end + +define dump_break_item + dump-break-item $arg0 +end + +define dump_select_item + dump-select-item $arg0 +end + +define dump_split_item + dump-split-item $arg0 +end + +# itemGroup +define dump_items + dump-item-group $arg0 +end + +# Token system +define dump_token + dump-token $arg0 +end + +define dump_token_range + dump-token-range $arg0 +end + +# Processing +define dump_pstate + dump-processing-state $arg0 +end + +define dump_sb + dump-string-builder $arg0 +end + +define dump_reader + dump-reader $arg0 +end + +define dump_writer + dump-writer $arg0 +end + +# 
ALU +define dump_alu_value + dump-alu-value $arg0 +end + +define dump_alu_counters + dump-alu-counters $arg0 +end + +define dump_alu_unit + dump-alu-unit $arg0 +end + +define dump_alu_vec + dump-alu-vec $arg0 +end + +define dump_alu_stats + dump-alu-value-stats $arg0 +end + +define dump_freq_map + dump-frequency-map $arg0 +end + +# Python interface +define dump_alu_function + dump-alu-function $arg0 +end + +define dump_external_func_rec + dump-external-function-rec $arg0 +end + +define dump_external_func_collection + dump-external-function-collection $arg0 +end + +define dump_python_func_collection + dump-python-function-collection $arg0 +end + +define dump_python_func_rec + dump-python-func-rec $arg0 +end + +define dump_python_func_arg + dump-python-func-arg $arg0 +end + +# Utilities +define dump_exception + dump-exception $arg0 +end + +# ============================================================================ +# USEFUL BREAKPOINT HELPERS +# ============================================================================ + +define bp_apply + break Item::apply +end +document bp_apply + Set a breakpoint on Item::apply to debug item application. +end + +define bp_getstr + break InputPart::getStr +end +document bp_getstr + Set a breakpoint on InputPart::getStr to debug input part string extraction. +end + +define bp_compile + break itemGroup::Compile +end +document bp_compile + Set a breakpoint on itemGroup::Compile to debug specification compilation. +end + +define bp_parseAluExpression + break parseAluExpression +end +document bp_parseAluExpression + Set a breakpoint on parseAluExpression to debug parsing of mathematical expressions. +end + +define bp_pfc_initialize + break PythonFunctionCollection::Initialize +end +document bp_pfc_initialize + Set a breakpoint on PythonFunctionCollection::Initialize to debug the initialization of the Python Function Collection. 
+end + +define bp_func_setargvalue + break PythonFuncRec::setArgValue +end +document bp_func_setargvalue + Set a breakpoint on PythonFuncRec::setArgValue to debug setting external function arguments. +end + +define bp_func_call + break PythonFuncRec::Call +end +document bp_func_call + Set a breakpoint on PythonFuncRec::Call to debug calling external functions. +end + +# ============================================================================ +# USEFUL GDB SETTINGS FOR SPECS DEBUGGING +# ============================================================================ + +set print pretty on +set print array on +set print array-indexes on +set print object on +set print static-members on + +# ============================================================================ +# WELCOME MESSAGE +# ============================================================================ + +echo \n +echo ========================================\n +echo specs GDB debugging macros loaded\n +echo ========================================\n +echo \n +echo Available commands:\n +echo dump_pstate - Dump ProcessingState\n +echo dump_sb - Dump StringBuilder\n +echo dump_item - Dump Item (polymorphic)\n +echo dump_items - Dump itemGroup\n +echo dump_token - Dump Token\n +echo dump_alu_value - Dump ALUValue\n +echo dump_alu_counters - Dump ALUCounters\n +echo dump_alu_vec - Dump AluVec\n +echo dump_alu_function - Dump AluFunction\n +echo dump_external_func_rec - Dump ExternalFunctionRec\n +echo dump_python_func_rec - Dump PythonFuncRec\n +echo dump_python_func_arg - Dump PythonFuncArg\n +echo dump_exception - Dump SpecsException\n +echo \n +echo Breakpoint helpers:\n +echo bp_apply - Break on Item::apply\n +echo bp_getstr - Break on InputPart::getStr\n +echo bp_compile - Break on itemGroup::Compile\n +echo bp_parseAluExpression - Break on parseAluExpression, where expressions are parsed\n +echo bp_pfc_initialize - Break on PythonFunctionCollection::Initialize, where the Python Function Collection is 
initialized\n +echo bp_func_setargvalue - Break on PythonFuncRec::setArgValue, where an argument for an external function is set\n +echo bp_func_call - Break on PythonFuncRec::Call, where an external function is invoked\n +echo \n +echo For more help, type: help dump-processing-state\n +echo \n diff --git a/specs/src/gdb/specs_gdb.py b/specs/src/gdb/specs_gdb.py new file mode 100644 index 0000000..7c6f5a8 --- /dev/null +++ b/specs/src/gdb/specs_gdb.py @@ -0,0 +1,1481 @@ +#!/usr/bin/env python3 +""" +GDB debugging extension for specs. + +Provides pretty-printers and dump commands for all major classes and structs +in the specs codebase, organized by subsystem. + +Usage: + (gdb) source specs/src/gdb/specs.gdb + (gdb) dump-processing-state pState + (gdb) dump-item myItem + (gdb) dump-alu-value myALUValue +""" + +import gdb +import gdb.printing +import struct +import sys + +# ============================================================================ +# ENUM DECODE HELPERS +# ============================================================================ + +CLOCK_TYPE = { + 0: "Static", + 1: "Dynamic", + 2: "Diff", +} + +APPLY_RET = { + 0: "Continue", + 1: "ContinueWithDataWritten", + 2: "Write", + 3: "Read", + 4: "ReadStop", + 5: "EnterLoop", + 6: "DoneLoop", + 7: "EOF", + 8: "UNREAD", + 9: "ReDo", + 10: "Break", + 11: "SkipToNext", + 12: "SplitStart", + 13: "SplitContinue", +} + +ALU_COUNTER_TYPE = { + 0: "None", + 1: "Str", + 2: "Int", + 3: "Float", +} + +OUTPUT_ALIGNMENT = { + 0: "Left", + 1: "Center", + 2: "Right", + 3: "Composed", +} + +ALU_UNIT_TYPE = { + 0: "Invalid", + 1: "None", + 2: "OpenParenthesis", + 3: "ClosingParenthesis", + 4: "Comma", + 5: "Identifier", + 6: "LiteralNumber", + 7: "Counter", + 8: "FieldIdentifier", + 9: "UnaryOp", + 10: "BinaryOp", + 11: "AssignmentOp", + 12: "InputRecord", + 13: "Null", +} + +ALU_UNARY_OP = { + 0: "Plus", + 1: "Minus", + 2: "Not", +} + +ALU_BINARY_OP = { + 0: "Add", + 1: "Sub", + 2: "Mult", + 3: "Div", + 4: "IntDiv", + 
5: "RemDiv", + 6: "Appnd", + 7: "LT", + 8: "LE", + 9: "GT", + 10: "GE", + 11: "SLT", + 12: "SLTE", + 13: "SGT", + 14: "SGTE", + 15: "EQ", + 16: "SEQ", + 17: "NE", + 18: "SNE", + 19: "AND", + 20: "OR", +} + +ALU_ASSN_OP = { + 0: "Let", + 1: "Add", + 2: "Sub", + 3: "Mult", + 4: "Div", + 5: "RemDiv", + 6: "IntDiv", + 7: "Appnd", +} + +CONDITION_PREDICATE = { + 0: "IF", + 1: "THEN", + 2: "ELSE", + 3: "ELSEIF", + 4: "ENDIF", + 5: "WHILE", + 6: "DO", + 7: "DONE", + 8: "ASSERT", +} + +RECORD_FORMAT = { + 0: "DELIMITED", + 1: "FIXED", + 2: "FIXED_DELIMITED", +} + +TIME_CLASSES = { + 0: "Initializing", + 1: "Processing", + 2: "IO", + 3: "InputQueue", + 4: "OutputQueue", + 5: "Draining", + 6: "Last", +} + +WRITER_TYPE = { + 0: "COUT", + 1: "CERR", + 2: "SHELL", + 3: "FILE", +} + +EXTERNAL_FUNC_ERROR_HANDLING = { + 0: "Throw", + 1: "NaN", + 2: "Zero", + 3: "NullStr", +} + +# Token types (X-macro generated, simplified list) +TOKEN_TYPES = { + 0: "STOP", + 1: "ALLEOF", + 2: "ANYEOF", + 3: "COUNTERS", + 4: "PRINTONLY", + 5: "EOF", + 6: "KEEP", + 7: "READ", + 8: "READSTOP", + 9: "WRITE", + 10: "NOWRITE", + 11: "ASSERT", + 12: "ABEND", + 13: "RANGELABEL", + 14: "ID", + 15: "PERIOD", + 16: "RANGE", + 17: "WORDRANGE", + 18: "FIELDSEPARATOR", + 19: "WORDSEPARATOR", + 20: "PAD", + 21: "NEXTWORD", + 22: "NEXTFIELD", + 23: "NEXT", + 24: "FIELDRANGE", + 25: "SUBSTRING", + 26: "OF", + 27: "GROUPSTART", + 28: "GROUPEND", + 29: "STRIP", + 30: "LITERAL", + 31: "CONVERSION", + 32: "LEFT", + 33: "CENTER", + 34: "RIGHT", + 35: "NUMBER", + 36: "TODCLOCK", + 37: "DTODCLOCK", + 38: "TIMEDIFF", + 39: "SET", + 40: "PRINT", + 41: "IF", + 42: "THEN", + 43: "ELSE", + 44: "ELSEIF", + 45: "ENDIF", + 46: "CONTINUE", + 47: "WHILE", + 48: "DO", + 49: "DONE", + 50: "UNREAD", + 51: "REDO", + 52: "BREAK", + 53: "SELECT", + 54: "FIRST", + 55: "SECOND", + 56: "OUTSTREAM", + 57: "STDERR", + 58: "REQUIRES", + 59: "SKIPUNTIL", + 60: "SKIPWHILE", + 61: "SPLITW", + 62: "SPLITF", + 63: "DUMMY", +} + 
+STRING_CONVERSIONS = { + 0: "identity", + 1: "ROT13", + 2: "C2B", + 3: "C2X", + 4: "B2C", + 5: "X2CH", + 6: "D2X", + 7: "X2D", + 8: "LCASE", + 9: "UCASE", + 10: "BSWAP", + 11: "ti2f", + 12: "tf2i", + 13: "s2tf", + 14: "tf2s", + 15: "mcs2tf", + 16: "tf2mcs", + 17: "NONE", +} + +# ============================================================================ +# UTILITY FUNCTIONS +# ============================================================================ + +def deref_shared_ptr(val): + """Dereference a std::shared_ptr to get the pointee.""" + try: + # libstdc++ layout: shared_ptr has _M_ptr member + ptr_val = val["_M_ptr"] + if ptr_val == 0: + return None + return ptr_val.dereference() + except: + try: + # Alternative: try to dereference directly + return val.dereference() + except: + return None + +def std_string_to_str(val): + """Extract a Python string from a std::string.""" + try: + # Try to get the string value directly + return val.string() + except: + try: + # Fallback: access _M_dataplus._M_p + return val["_M_dataplus"]["_M_p"].string() + except: + return "" + +def std_vector_size(val): + """Get the size of a std::vector.""" + try: + return int(val["_M_impl"]["_M_finish"] - val["_M_impl"]["_M_start"]) + except: + return 0 + +def identify_dynamic_type(val): + """ + Identify the actual derived type of a polymorphic object by reading the vtable. + Returns the demangled type name. + """ + try: + # Get the vtable pointer (first member of any polymorphic object) + vtable_ptr = val.address.cast(gdb.lookup_type("void").pointer().pointer()).dereference() + # Get the type info from the vtable (usually at offset -1) + typeinfo = vtable_ptr.cast(gdb.lookup_type("void").pointer().pointer())[-1] + # "info symbol" prints "typeinfo for TYPE in section ..."; keep only TYPE + return gdb.execute(f"info symbol {int(typeinfo):#x}", to_string=True).split(" in section")[0].replace("typeinfo for ", "").strip() + except: + return "Unknown" + +def call_method_safe(val, method_name, *args): + """ + Safely call a method on a value in the inferior. 
+ Returns the result as a gdb.Value, or None if the call fails. + """ + try: + arg_str = ", ".join(str(a) for a in args) + expr = f"(({val.type.name}*){val.address})->{method_name}({arg_str})" + result = gdb.parse_and_eval(expr) + return result + except: + return None + +# ============================================================================ +# PRETTY-PRINTERS +# ============================================================================ + +class ALUValuePrinter: + """Pretty-printer for ALUValue.""" + + def __init__(self, val): + self.val = val + + def to_string(self): + type_val = int(self.val["m_type"]) + type_str = ALU_COUNTER_TYPE.get(type_val, f"Unknown({type_val})") + value_str = std_string_to_str(self.val["m_value"]) + exact = bool(self.val["m_exact"]) + + return f"ALUValue {{type: {type_str}, value: \"{value_str}\", exact: {exact}}}" + +class TokenPrinter: + """Pretty-printer for Token.""" + + def __init__(self, val): + self.val = val + + def to_string(self): + type_val = int(self.val["m_type"]) + type_str = TOKEN_TYPES.get(type_val, f"Unknown({type_val})") + literal = std_string_to_str(self.val["m_literal"]) + argc = int(self.val["m_argc"]) + + return f"Token {{type: {type_str}, literal: \"{literal}\", argc: {argc}}}" + +class ProcessingStatePrinter: + """Abbreviated pretty-printer for ProcessingState.""" + + def __init__(self, val): + self.val = val + + def to_string(self): + try: + # Current record + ps = deref_shared_ptr(self.val["m_ps"]) + if ps: + record_str = std_string_to_str(ps) + else: + record_str = "" + + cycle = int(self.val["m_CycleCounter"]) + + return f"ProcessingState {{record: \"{record_str[:30]}{'...' if len(record_str) > 30 else ''}\", cycle: {cycle}}}" + except: + return "ProcessingState {}" + +class SpecsExceptionPrinter: + """Pretty-printer for SpecsException.""" + + def __init__(self, val): + self.val = val + + def to_string(self): + fn = self.val["fn"] + msg = std_string_to_str(self.val["msg"]) + ln = int(self.val["ln"]) + is_abend = bool(self.val["bIsAbend"]) + 
abend_str = " [ABEND]" if is_abend else "" + return f"SpecsException {{{fn}:{ln}: {msg}{abend_str}}}" + +def build_pretty_printer(): + """Build and register all pretty-printers.""" + pp = gdb.printing.RegexPrettyPrinter("specs") + pp.add_printer("ALUValue", "^ALUValue$", ALUValuePrinter) + pp.add_printer("Token", "^Token$", TokenPrinter) + pp.add_printer("ProcessingState", "^ProcessingState$", ProcessingStatePrinter) + pp.add_printer("SpecsException", "^SpecsException$", SpecsExceptionPrinter) + return pp + +# ============================================================================ +# DUMP COMMANDS - InputPart Hierarchy +# ============================================================================ + +class DumpInputPart(gdb.Command): + """Dump an InputPart (polymorphic).""" + + def __init__(self): + super(DumpInputPart, self).__init__("dump-input-part", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + self._dump_input_part(val, 0) + except Exception as e: + print(f"Error: {e}") + + def _dump_input_part(self, val, indent=0): + prefix = " " * indent + + # Try to call Debug() to get the type-specific string + try: + debug_result = call_method_safe(val, "Debug") + if debug_result: + debug_str = str(debug_result) + else: + debug_str = "" + except: + debug_str = "" + + print(f"{prefix}InputPart @ {val.address}") + print(f"{prefix} Debug: {debug_str}") + + # Try to call virtual methods + try: + reads = call_method_safe(val, "readsLines") + print(f"{prefix} readsLines: {bool(reads)}") + except: + pass + + try: + forces = call_method_safe(val, "forcesRunoutCycle") + print(f"{prefix} forcesRunoutCycle: {bool(forces)}") + except: + pass + +class DumpLiteralPart(gdb.Command): + """Dump a LiteralPart.""" + + def __init__(self): + super(DumpLiteralPart, self).__init__("dump-literal-part", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + m_str = std_string_to_str(val["m_Str"]) + 
print(f"LiteralPart @ {val.address}") + print(f" m_Str: \"{m_str}\"") + except Exception as e: + print(f"Error: {e}") + +class DumpRangePart(gdb.Command): + """Dump a RangePart.""" + + def __init__(self): + super(DumpRangePart, self).__init__("dump-range-part", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + from_val = int(val["_from"]) + to_val = int(val["_to"]) + print(f"RangePart @ {val.address}") + print(f" _from: {from_val}") + print(f" _to: {to_val}") + print(f" readsLines: true") + except Exception as e: + print(f"Error: {e}") + +class DumpWordRangePart(gdb.Command): + """Dump a WordRangePart.""" + + def __init__(self): + super(DumpWordRangePart, self).__init__("dump-word-range-part", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + from_val = int(val["_from"]) + to_val = int(val["_to"]) + sep = std_string_to_str(val["m_WordSep"]) + print(f"WordRangePart @ {val.address}") + print(f" _from: {from_val}") + print(f" _to: {to_val}") + print(f" m_WordSep: \"{sep}\"") + except Exception as e: + print(f"Error: {e}") + +class DumpFieldRangePart(gdb.Command): + """Dump a FieldRangePart.""" + + def __init__(self): + super(DumpFieldRangePart, self).__init__("dump-field-range-part", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + from_val = int(val["_from"]) + to_val = int(val["_to"]) + sep = std_string_to_str(val["m_FieldSep"]) + print(f"FieldRangePart @ {val.address}") + print(f" _from: {from_val}") + print(f" _to: {to_val}") + print(f" m_FieldSep: \"{sep}\"") + except Exception as e: + print(f"Error: {e}") + +class DumpClockPart(gdb.Command): + """Dump a ClockPart.""" + + def __init__(self): + super(DumpClockPart, self).__init__("dump-clock-part", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + type_val = int(val["m_Type"]) + type_str = CLOCK_TYPE.get(type_val, 
f"Unknown({type_val})") + clock = int(val["m_StaticClock"]) + print(f"ClockPart @ {val.address}") + print(f" m_Type: {type_str}") + print(f" m_StaticClock: {clock}") + except Exception as e: + print(f"Error: {e}") + +class DumpIDPart(gdb.Command): + """Dump an IDPart.""" + + def __init__(self): + super(DumpIDPart, self).__init__("dump-id-part", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + fid = std_string_to_str(val["m_fieldIdentifier"]) + print(f"IDPart @ {val.address}") + print(f" m_fieldIdentifier: \"{fid}\"") + except Exception as e: + print(f"Error: {e}") + +class DumpExpressionPart(gdb.Command): + """Dump an ExpressionPart.""" + + def __init__(self): + super(DumpExpressionPart, self).__init__("dump-expression-part", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + raw_expr = std_string_to_str(val["m_rawExpression"]) + is_assn = bool(val["m_isAssignment"]) + print(f"ExpressionPart @ {val.address}") + print(f" m_rawExpression: \"{raw_expr}\"") + print(f" m_isAssignment: {is_assn}") + except Exception as e: + print(f"Error: {e}") + +# ============================================================================ +# DUMP COMMANDS - Item Hierarchy +# ============================================================================ + +class DumpItem(gdb.Command): + """Dump an Item (polymorphic).""" + + def __init__(self): + super(DumpItem, self).__init__("dump-item", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + orig_idx = int(val["m_originalIndex"]) + print(f"Item @ {val.address}") + print(f" m_originalIndex: {orig_idx}") + + # Try to call virtual methods + try: + debug_result = call_method_safe(val, "Debug") + if debug_result: + print(f" Debug: {debug_result}") + except: + pass + + try: + reads = call_method_safe(val, "readsLines") + print(f" readsLines: {bool(reads)}") + except: + pass + + try: + produces = 
call_method_safe(val, "producesOutput") + print(f" producesOutput: {bool(produces)}") + except: + pass + + try: + forces = call_method_safe(val, "forcesRunoutCycle") + print(f" forcesRunoutCycle: {bool(forces)}") + except: + pass + + try: + is_break = call_method_safe(val, "isBreak") + print(f" isBreak: {bool(is_break)}") + except: + pass + except Exception as e: + print(f"Error: {e}") + +class DumpDataField(gdb.Command): + """Dump a DataField.""" + + def __init__(self): + super(DumpDataField, self).__init__("dump-data-field", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + label = chr(int(val["m_label"])) if int(val["m_label"]) > 0 else "none" + out_start = int(val["m_outStart"]) + max_len = int(val["m_maxLength"]) + strip = bool(val["m_strip"]) + conv = int(val["m_conversion"]) + align = int(val["m_alignment"]) + + conv_str = STRING_CONVERSIONS.get(conv, f"Unknown({conv})") + align_str = OUTPUT_ALIGNMENT.get(align, f"Unknown({align})") + + print(f"DataField @ {val.address}") + print(f" m_label: {label}") + print(f" m_outStart: {out_start}") + print(f" m_maxLength: {max_len}") + print(f" m_strip: {strip}") + print(f" m_conversion: {conv_str}") + print(f" m_alignment: {align_str}") + except Exception as e: + print(f"Error: {e}") + +class DumpTokenItem(gdb.Command): + """Dump a TokenItem.""" + + def __init__(self): + super(DumpTokenItem, self).__init__("dump-token-item", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + token = deref_shared_ptr(val["mp_Token"]) + if token: + type_val = int(token["m_type"]) + type_str = TOKEN_TYPES.get(type_val, f"Unknown({type_val})") + print(f"TokenItem @ {val.address}") + print(f" Token type: {type_str}") + else: + print(f"TokenItem @ {val.address}") + print(f" mp_Token: ") + except Exception as e: + print(f"Error: {e}") + +class DumpSetItem(gdb.Command): + """Dump a SetItem.""" + + def __init__(self): + super(DumpSetItem, 
self).__init__("dump-set-item", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + raw_expr = std_string_to_str(val["m_rawExpression"]) + key = int(val["m_key"]) + print(f"SetItem @ {val.address}") + print(f" m_rawExpression: \"{raw_expr}\"") + print(f" m_key: {key}") + except Exception as e: + print(f"Error: {e}") + +class DumpSkipItem(gdb.Command): + """Dump a SkipItem.""" + + def __init__(self): + super(DumpSkipItem, self).__init__("dump-skip-item", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + raw_expr = std_string_to_str(val["m_rawExpression"]) + is_until = bool(val["m_bIsUntil"]) + satisfied = bool(val["m_bSatisfied"]) + skip_type = "SKIPUNTIL" if is_until else "SKIPWHILE" + print(f"SkipItem @ {val.address}") + print(f" Type: {skip_type}") + print(f" m_rawExpression: \"{raw_expr}\"") + print(f" m_bSatisfied: {satisfied}") + except Exception as e: + print(f"Error: {e}") + +class DumpConditionItem(gdb.Command): + """Dump a ConditionItem.""" + + def __init__(self): + super(DumpConditionItem, self).__init__("dump-condition-item", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + pred = int(val["m_pred"]) + pred_str = CONDITION_PREDICATE.get(pred, f"Unknown({pred})") + raw_expr = std_string_to_str(val["m_rawExpression"]) + is_assn = bool(val["m_isAssignment"]) + print(f"ConditionItem @ {val.address}") + print(f" m_pred: {pred_str}") + print(f" m_rawExpression: \"{raw_expr}\"") + print(f" m_isAssignment: {is_assn}") + except Exception as e: + print(f"Error: {e}") + +class DumpBreakItem(gdb.Command): + """Dump a BreakItem.""" + + def __init__(self): + super(DumpBreakItem, self).__init__("dump-break-item", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + ident = chr(int(val["m_identifier"])) + print(f"BreakItem @ {val.address}") + print(f" m_identifier: {ident}") + except Exception 
as e: + print(f"Error: {e}") + +class DumpSelectItem(gdb.Command): + """Dump a SelectItem.""" + + def __init__(self): + super(DumpSelectItem, self).__init__("dump-select-item", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + stream = int(val["m_stream"]) + b_output = bool(val["bOutput"]) + print(f"SelectItem @ {val.address}") + print(f" m_stream: {stream}") + print(f" bOutput: {b_output}") + except Exception as e: + print(f"Error: {e}") + +class DumpSplitItem(gdb.Command): + """Dump a SplitItem.""" + + def __init__(self): + super(DumpSplitItem, self).__init__("dump-split-item", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + is_field = bool(val["m_isField"]) + sep = std_string_to_str(val["m_separator"]) + splitting = bool(val["m_splitting"]) + current_piece = int(val["m_currentPiece"]) + split_type = "SPLITF" if is_field else "SPLITW" + print(f"SplitItem @ {val.address}") + print(f" Type: {split_type}") + print(f" m_separator: \"{sep}\"") + print(f" m_splitting: {splitting}") + print(f" m_currentPiece: {current_piece}") + except Exception as e: + print(f"Error: {e}") + +# ============================================================================ +# DUMP COMMANDS - itemGroup +# ============================================================================ + +class DumpItemGroup(gdb.Command): + """Dump an itemGroup.""" + + def __init__(self): + super(DumpItemGroup, self).__init__("dump-item-group", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + need_runout = bool(val["bNeedRunoutCycle"]) + found_second = bool(val["bFoundSelectSecond"]) + + # Get m_items vector + items_vec = val["m_items"] + item_count = std_vector_size(items_vec) + + print(f"itemGroup @ {val.address}") + print(f" bNeedRunoutCycle: {need_runout}") + print(f" bFoundSelectSecond: {found_second}") + print(f" Item count: {item_count}") + print(f" Items:") + + # 
Try to iterate items (simplified) + for i in range(min(item_count, 10)): # Limit to first 10 + try: + item = items_vec["_M_impl"]["_M_start"][i] + print(f" [{i}] @ {item.address}") + except: + pass + + if item_count > 10: + print(f" ... and {item_count - 10} more items") + except Exception as e: + print(f"Error: {e}") + +# ============================================================================ +# DUMP COMMANDS - Token System +# ============================================================================ + +class DumpToken(gdb.Command): + """Dump a Token.""" + + def __init__(self): + super(DumpToken, self).__init__("dump-token", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + type_val = int(val["m_type"]) + type_str = TOKEN_TYPES.get(type_val, f"Unknown({type_val})") + literal = std_string_to_str(val["m_literal"]) + argc = int(val["m_argc"]) + orig = std_string_to_str(val["m_orig"]) + + print(f"Token @ {val.address}") + print(f" m_type: {type_str}") + print(f" m_literal: \"{literal}\"") + print(f" m_argc: {argc}") + print(f" m_orig: \"{orig}\"") + except Exception as e: + print(f"Error: {e}") + +class DumpTokenRange(gdb.Command): + """Dump a TokenFieldRange.""" + + def __init__(self): + super(DumpTokenRange, self).__init__("dump-token-range", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + b_done = bool(val["bDone"]) + + # Try to access Simple range members + try: + first = int(val["m_first"]) + last = int(val["m_last"]) + print(f"TokenFieldRangeSimple @ {val.address}") + print(f" m_first: {first}") + print(f" m_last: {last}") + print(f" bDone: {b_done}") + except: + print(f"TokenFieldRange @ {val.address}") + print(f" bDone: {b_done}") + except Exception as e: + print(f"Error: {e}") + +# ============================================================================ +# DUMP COMMANDS - Processing +# 
============================================================================ + +class DumpProcessingState(gdb.Command): + """Dump a ProcessingState.""" + + def __init__(self): + super(DumpProcessingState, self).__init__("dump-processing-state", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + + # Current record + ps = deref_shared_ptr(val["m_ps"]) + if ps: + record_str = std_string_to_str(ps) + else: + record_str = "" + + # Previous record + prev_ps = deref_shared_ptr(val["m_prevPs"]) + if prev_ps: + prev_record_str = std_string_to_str(prev_ps) + else: + prev_record_str = "" + + pad = chr(int(val["m_pad"])) + word_sep = std_string_to_str(val["m_wordSeparator"]) + field_sep = std_string_to_str(val["m_fieldSeparator"]) + cycle = int(val["m_CycleCounter"]) + extra_reads = int(val["m_ExtraReads"]) + word_count = int(val["m_wordCount"]) + field_count = int(val["m_fieldCount"]) + input_station = int(val["m_inputStation"]) + input_stream = int(val["m_inputStream"]) + output_idx = int(val["m_outputIndex"]) + no_write = bool(val["m_bNoWrite"]) + eof = bool(val["m_bEOF"]) + + print(f"ProcessingState @ {val.address}") + print(f" Current Record: \"{record_str[:50]}{'...' if len(record_str) > 50 else ''}\"") + print(f" Previous Record: \"{prev_record_str[:50]}{'...' 
if len(prev_record_str) > 50 else ''}\"") + print(f" Pad Char: '{pad}' (0x{ord(pad):02x})") + print(f" Word Separator: \"{word_sep}\"") + print(f" Field Separator: \"{field_sep}\"") + print(f" Cycle Counter: {cycle}") + print(f" Extra Reads: {extra_reads}") + print(f" Record Count: {cycle + extra_reads}") + print(f" Word Count: {word_count}") + print(f" Field Count: {field_count}") + print(f" Input Station: {input_station}") + print(f" Input Stream: {input_stream}") + print(f" Output Index: {output_idx}") + print(f" No Write: {no_write}") + print(f" EOF: {eof}") + except Exception as e: + print(f"Error: {e}") + +class DumpStringBuilder(gdb.Command): + """Dump a StringBuilder.""" + + def __init__(self): + super(DumpStringBuilder, self).__init__("dump-string-builder", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + + # Current string + mp_str = deref_shared_ptr(val["mp_str"]) + if mp_str: + str_content = std_string_to_str(mp_str) + else: + str_content = "" + + pos = int(val["m_pos"]) + pad = chr(int(val["m_pad"])) + + print(f"StringBuilder @ {val.address}") + print(f" Current String: \"{str_content[:50]}{'...' 
if len(str_content) > 50 else ''}\"") + print(f" Position: {pos}") + print(f" Pad Char: '{pad}' (0x{ord(pad):02x})") + except Exception as e: + print(f"Error: {e}") + +class DumpReader(gdb.Command): + """Dump a Reader.""" + + def __init__(self): + super(DumpReader, self).__init__("dump-reader", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + count_read = int(val["m_countRead"]) + count_used = int(val["m_countUsed"]) + b_abort = bool(val["m_bAbort"]) + b_ran_dry = bool(val["m_bRanDry"]) + + print(f"Reader @ {val.address}") + print(f" m_countRead: {count_read}") + print(f" m_countUsed: {count_used}") + print(f" m_bAbort: {b_abort}") + print(f" m_bRanDry: {b_ran_dry}") + except Exception as e: + print(f"Error: {e}") + +class DumpWriter(gdb.Command): + """Dump a Writer.""" + + def __init__(self): + super(DumpWriter, self).__init__("dump-writer", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + count_gen = int(val["m_countGenerated"]) + count_written = int(val["m_countWritten"]) + ended = bool(val["m_ended"]) + + print(f"Writer @ {val.address}") + print(f" m_countGenerated: {count_gen}") + print(f" m_countWritten: {count_written}") + print(f" m_ended: {ended}") + except Exception as e: + print(f"Error: {e}") + +# ============================================================================ +# DUMP COMMANDS - ALU +# ============================================================================ + +class DumpALUValue(gdb.Command): + """Dump an ALUValue.""" + + def __init__(self): + super(DumpALUValue, self).__init__("dump-alu-value", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + type_val = int(val["m_type"]) + type_str = ALU_COUNTER_TYPE.get(type_val, f"Unknown({type_val})") + value_str = std_string_to_str(val["m_value"]) + exact = bool(val["m_exact"]) + + print(f"ALUValue @ {val.address}") + print(f" m_type: {type_str}") + print(f" 
m_value: \"{value_str}\"") + print(f" m_exact: {exact}") + except Exception as e: + print(f"Error: {e}") + +class DumpALUCounters(gdb.Command): + """Dump ALUCounters.""" + + def __init__(self): + super(DumpALUCounters, self).__init__("dump-alu-counters", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + # m_map is a std::map + print(f"ALUCounters @ {val.address}") + print(f" Counters (map):") + # Simplified: just show the address + print(f" m_map @ {val['m_map'].address}") + except Exception as e: + print(f"Error: {e}") + +class DumpAluUnit(gdb.Command): + """Dump an AluUnit (polymorphic).""" + + def __init__(self): + super(DumpAluUnit, self).__init__("dump-alu-unit", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + + # Try to call virtual methods + try: + identify = call_method_safe(val, "_identify") + if identify: + print(f"AluUnit @ {val.address}") + print(f" _identify: {identify}") + except: + print(f"AluUnit @ {val.address}") + except Exception as e: + print(f"Error: {e}") + +class DumpAluVec(gdb.Command): + """Dump an AluVec (vector).""" + + def __init__(self): + super(DumpAluVec, self).__init__("dump-alu-vec", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + size = std_vector_size(val) + + print(f"AluVec @ {val.address}") + print(f" Size: {size}") + print(f" Units:") + + for i in range(min(size, 10)): + try: + unit = val["_M_impl"]["_M_start"][i] + print(f" [{i}] @ {unit.address}") + except: + pass + + if size > 10: + print(f" ... 
and {size - 10} more units") + except Exception as e: + print(f"Error: {e}") + +class DumpAluValueStats(gdb.Command): + """Dump AluValueStats.""" + + def __init__(self): + super(DumpAluValueStats, self).__init__("dump-alu-value-stats", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + int_count = int(val["m_intCount"]) + float_count = int(val["m_floatCount"]) + total_count = int(val["m_totalCount"]) + + print(f"AluValueStats @ {val.address}") + print(f" m_intCount: {int_count}") + print(f" m_floatCount: {float_count}") + print(f" m_totalCount: {total_count}") + except Exception as e: + print(f"Error: {e}") + +class DumpFrequencyMap(gdb.Command): + """Dump a frequencyMap.""" + + def __init__(self): + super(DumpFrequencyMap, self).__init__("dump-frequency-map", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + counter = int(val["counter"]) + + print(f"frequencyMap @ {val.address}") + print(f" counter (samples): {counter}") + print(f" map @ {val['map'].address}") + except Exception as e: + print(f"Error: {e}") + +# ============================================================================ +# DUMP COMMANDS - Utilities +# ============================================================================ + +class DumpException(gdb.Command): + """Dump a SpecsException.""" + + def __init__(self): + super(DumpException, self).__init__("dump-exception", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + fn = str(val["fn"]) + msg = std_string_to_str(val["msg"]) + ln = int(val["ln"]) + is_abend = bool(val["bIsAbend"]) + + abend_str = " [ABEND]" if is_abend else "" + print(f"SpecsException @ {val.address}") + print(f" File: {fn}") + print(f" Line: {ln}") + print(f" Message: {msg}{abend_str}") + except Exception as e: + print(f"Error: {e}") + +# ============================================================================ +# DUMP COMMANDS - Python 
Interface +# ============================================================================ + +class DumpAluFunction(gdb.Command): + """Dump an AluFunction.""" + + def __init__(self): + super(DumpAluFunction, self).__init__("dump-alu-function", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + func_name = std_string_to_str(val["m_FuncName"]) + arg_count = int(val["m_ArgCount"]) + relies_on_input = bool(val["m_reliesOnInput"]) + + print(f"AluFunction @ {val.address}") + print(f" m_FuncName: {func_name}") + print(f" m_ArgCount: {arg_count}") + print(f" m_reliesOnInput: {relies_on_input}") + except Exception as e: + print(f"Error: {e}") + +class DumpExternalFunctionRec(gdb.Command): + """Dump an ExternalFunctionRec (polymorphic) - calls virtual methods.""" + + def __init__(self): + super(DumpExternalFunctionRec, self).__init__("dump-external-function-rec", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + + print(f"ExternalFunctionRec @ {val.address}") + + # Try to call virtual methods to get information + try: + arg_count = call_method_safe(val, "GetArgCount") + print(f" GetArgCount(): {arg_count}") + except Exception as e: + print(f" GetArgCount(): (error: {e})") + + try: + func_ptr = call_method_safe(val, "GetFuncPtr") + print(f" GetFuncPtr(): {func_ptr}") + except Exception as e: + print(f" GetFuncPtr(): (error: {e})") + + # Try to detect actual derived type + try: + actual_type = identify_dynamic_type(val) + if actual_type and actual_type != "ExternalFunctionRec": + print(f" Actual type: {actual_type}") + except: + pass + except Exception as e: + print(f"Error: {e}") + +class DumpExternalFunctionCollection(gdb.Command): + """Dump an ExternalFunctionCollection.""" + + def __init__(self): + super(DumpExternalFunctionCollection, self).__init__("dump-external-function-collection", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + + # 
Try to call virtual methods + try: + is_init = call_method_safe(val, "IsInitialized") + print(f"ExternalFunctionCollection @ {val.address}") + print(f" IsInitialized: {bool(is_init)}") + except: + try: + count = call_method_safe(val, "CountFunctions") + print(f"ExternalFunctionCollection @ {val.address}") + print(f" CountFunctions: {count}") + except: + print(f"ExternalFunctionCollection @ {val.address}") + except Exception as e: + print(f"Error: {e}") + +class DumpPythonFunctionCollection(gdb.Command): + """Dump a PythonFunctionCollection (internal class from PythonIntf.cc).""" + + def __init__(self): + super(DumpPythonFunctionCollection, self).__init__("dump-python-function-collection", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + + # Access m_Initialized + try: + m_initialized = bool(val["m_Initialized"]) + print(f"PythonFunctionCollection @ {val.address}") + print(f" m_Initialized: {m_initialized}") + + # Try to access m_Functions map (simplified) + try: + m_functions = val["m_Functions"] + print(f" m_Functions @ {m_functions.address}") + except: + pass + except: + print(f"PythonFunctionCollection @ {val.address}") + except Exception as e: + print(f"Error: {e}") + +class DumpPythonFuncRec(gdb.Command): + """Dump a PythonFuncRec (internal class from PythonIntf.cc).""" + + def __init__(self): + super(DumpPythonFuncRec, self).__init__("dump-python-func-rec", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + + print(f"PythonFuncRec @ {val.address}") + + # Access members + try: + m_name = std_string_to_str(val["m_name"]) + print(f" m_name: {m_name}") + except Exception as e: + print(f" m_name: (error: {e})") + + try: + m_pFuncPtr = val["m_pFuncPtr"] + print(f" m_pFuncPtr: {m_pFuncPtr}") + except Exception as e: + print(f" m_pFuncPtr: (error: {e})") + + # Show m_doc + try: + m_doc = std_string_to_str(val["m_doc"]) + if m_doc: + # Format multi-line docs nicely + if "\n" in 
m_doc: + print(f" m_doc:") + for line in m_doc.split("\n"): + print(f" {line}") + else: + print(f" m_doc: {m_doc}") + else: + print(f" m_doc: (empty)") + except Exception as e: + print(f" m_doc: (error: {e})") + + # Show m_pTuple + try: + m_pTuple = val["m_pTuple"] + if m_pTuple == 0: + print(f" m_pTuple: nullptr") + else: + print(f" m_pTuple: {m_pTuple}") + except Exception as e: + print(f" m_pTuple: (error: {e})") + + # Expand m_args vector + try: + m_args = val["m_args"] + arg_size = std_vector_size(m_args) + print(f" m_args ({arg_size} items):") + + # Try to iterate and dump each argument + for i in range(arg_size): + try: + arg_elem = m_args[i] + arg_name = std_string_to_str(arg_elem["m_name"]) + arg_default = int(arg_elem["m_default"]) + arg_default_str = ALU_COUNTER_TYPE.get(arg_default, f"Unknown({arg_default})") + + print(f" [{i}] {arg_name} (default: {arg_default_str})") + + # Show default value if present + if arg_default == 1: # counterType__Str + try: + defStr = std_string_to_str(arg_elem["m_defStr"]) + print(f" = \"{defStr}\"") + except: + pass + elif arg_default == 2: # counterType__Int + try: + defInt = int(arg_elem["m_defInt"]) + print(f" = {defInt}") + except: + pass + elif arg_default == 3: # counterType__Float + try: + defFloat = float(arg_elem["m_defFloat"]) + print(f" = {defFloat}") + except: + pass + except Exception as arg_e: + print(f" [{i}] (error: {arg_e})") + except Exception as e: + print(f" m_args: (error: {e})") + except Exception as e: + print(f"Error: {e}") + +class DumpPythonFuncArg(gdb.Command): + """Dump a PythonFuncArg (internal class from PythonIntf.cc).""" + + def __init__(self): + super(DumpPythonFuncArg, self).__init__("dump-python-func-arg", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + + # Access members + try: + m_name = std_string_to_str(val["m_name"]) + m_default = int(val["m_default"]) + m_default_str = ALU_COUNTER_TYPE.get(m_default, f"Unknown({m_default})") + + 
print(f"PythonFuncArg @ {val.address}") + print(f" m_name: {m_name}") + print(f" m_default: {m_default_str}") + + # Try to get default value + if m_default == 1: # counterType__Str + try: + m_defStr = std_string_to_str(val["m_defStr"]) + print(f" m_defStr: \"{m_defStr}\"") + except: + pass + elif m_default == 2: # counterType__Int + try: + m_defInt = int(val["m_defInt"]) + print(f" m_defInt: {m_defInt}") + except: + pass + elif m_default == 3: # counterType__Float + try: + m_defFloat = float(val["m_defFloat"]) + print(f" m_defFloat: {m_defFloat}") + except: + pass + except Exception as inner_e: + print(f"PythonFuncArg @ {val.address}") + print(f" (Error reading members: {inner_e})") + except Exception as e: + print(f"Error: {e}") + +# ============================================================================ +# CONVENIENCE COMMAND +# ============================================================================ + +class DumpAll(gdb.Command): + """Dump all relevant debugging info (ProcessingState + StringBuilder + current Item).""" + + def __init__(self): + super(DumpAll, self).__init__("dump-all", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + print("=== Full Debug Dump ===") + print("\nNote: Provide arguments like: dump-all pState sb item") + print("This is a convenience command for dumping multiple objects at once.") + +# ============================================================================ +# REGISTRATION +# ============================================================================ + +def register_commands(): + """Register all dump commands.""" + # InputPart commands + DumpInputPart() + DumpLiteralPart() + DumpRangePart() + DumpWordRangePart() + DumpFieldRangePart() + DumpClockPart() + DumpIDPart() + DumpExpressionPart() + + # Item commands + DumpItem() + DumpDataField() + DumpTokenItem() + DumpSetItem() + DumpSkipItem() + DumpConditionItem() + DumpBreakItem() + DumpSelectItem() + DumpSplitItem() + + # itemGroup command + DumpItemGroup() + + # 
Token commands + DumpToken() + DumpTokenRange() + + # Processing commands + DumpProcessingState() + DumpStringBuilder() + DumpReader() + DumpWriter() + + # ALU commands + DumpALUValue() + DumpALUCounters() + DumpAluUnit() + DumpAluVec() + DumpAluValueStats() + DumpFrequencyMap() + + # Python interface commands + DumpAluFunction() + DumpExternalFunctionRec() + DumpExternalFunctionCollection() + DumpPythonFunctionCollection() + DumpPythonFuncRec() + DumpPythonFuncArg() + + # Utility commands + DumpException() + DumpAll() + +# Register pretty-printers (only if an object file is loaded) +try: + objfile = gdb.objfile.current_objfile() + if objfile: + gdb.printing.register_pretty_printer(objfile, build_pretty_printer()) +except: + # No object file loaded yet; pretty-printers will be registered when one is loaded + pass + +# Register all commands +register_commands() + +print("specs GDB extension loaded successfully") From 4d656332a75d241b10f155ece7c97fe10f957893 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Sun, 17 May 2026 11:09:25 +0300 Subject: [PATCH 11/50] Issue #374 - Allow Python function to mark return value with exactness (#377) --- manpage | 20 ++++++++++- specs/docs/pyfuncs.md | 51 ++++++++++++++++++++++++++++ specs/src/utils/PythonIntf.cc | 51 ++++++++++++++++++++++++++++ specs/src/utils/alu.cc | 8 ++++- specs/tests/pytest.py | 64 +++++++++++++++++++++++++++++++++++ 5 files changed, 192 insertions(+), 2 deletions(-) diff --git a/manpage b/manpage index 71891bf..7875139 100644 --- a/manpage +++ b/manpage @@ -990,7 +990,9 @@ Returns whether the evaluation of .I expression results in an exact result (1) or not (0). .B NOTE: -This function has some limitations. For example, Python functions are always taken to return an exact value when they return an integer or a string, but an inexact value when they return a float, which may or may not be correct. 
When unsure, the +Python functions use a heuristic where integer and string returns are exact, while float returns are inexact. This may not always be correct. Python functions can override this by returning a 2-tuple +.B (value, exactness) +to explicitly specify exactness. When unsure, the .B exact function errs on the side of returning 0. @@ -1807,6 +1809,22 @@ specs print "commas(word(1))" 1.12 RIGHT .PP Python functions may return an integer, floating-point number, string, or .B None. +By default, integer and string returns are marked as exact, while floating-point +returns are marked as inexact. To override this heuristic, a function may instead +return a 2-tuple +.B (value, exactness) +where +.I value +is the return value (integer, float, string, or None) and +.I exactness +is a Python boolean +.B True +or +.B False. +For example, +.B return (3.14159, True) +returns an exact floating-point value. +.PP Docstrings are used by .B specs --help pyfuncs when listing loaded Python functions. Use diff --git a/specs/docs/pyfuncs.md b/specs/docs/pyfuncs.md index c07738a..a4b25b0 100644 --- a/specs/docs/pyfuncs.md +++ b/specs/docs/pyfuncs.md @@ -102,3 +102,54 @@ specs set "#0:=countocc(@@,'hello')" EOF print "#0" 1 ``` This counts the lines in the input that included the word 'hello'. +## Exactness + +By default, **specs** applies a heuristic to determine whether a Python function's return value is exact or inexact: +- Integer returns are marked as **exact** +- String returns are marked as **exact** +- Floating-point returns are marked as **inexact** + +This heuristic may not always be correct. For example, a function that computes π should return an inexact value, but a function that computes a well-defined mathematical constant might return an exact value. 
+ +To override the default heuristic, a Python function can return a 2-tuple instead of a plain value: +```python +def exact_pi(): + '''Return an exact value of pi''' + return (3.141592653589793, True) + +def inexact_sqrt(): + '''Return an inexact square root''' + return (2.23606797749979, False) +``` + +The tuple must have exactly 2 elements: +1. The first element is the return value (integer, float, string, or None) +2. The second element is a Python boolean: `True` for exact, `False` for inexact + +If the tuple is malformed (wrong number of elements, or second element is not a boolean), **specs** will report an error. + +For example: +```python +def good_exact(): + return (42, True) # OK: exact integer + +def good_inexact(): + return (3.14, False) # OK: inexact float + +def bad_tuple_size(): + return (1, 2, 3) # ERROR: tuple has 3 elements, not 2 + +def bad_exactness_type(): + return (1.5, 1) # ERROR: second element is int, not bool +``` + +The `exact()` built-in function can be used to check whether a value is exact: +``` +specs print "exact(exact_pi())" 1 +``` +would print `1` (true), while: +``` +specs print "exact(inexact_sqrt())" 1 +``` +would print `0` (false).
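The return-value rules described in the Exactness section can be modelled in a few lines of plain Python. This is only an illustrative sketch, not the code in `PythonIntf.cc`: the helper name `split_return` is hypothetical, the use of `ValueError` is an assumption, and treating `None` as exact by default is also an assumption (the text only states the defaults for integers, strings and floats). The message prefixes are the ones the tests in this patch look for.

```python
def split_return(func_name, ret):
    """Hypothetical model: map a Python return value to (value, exact)."""
    if isinstance(ret, tuple):
        # An explicit override must be exactly (value, exactness)
        if len(ret) != 2:
            raise ValueError("Invalid tuple returned from function " + func_name)
        value, exact = ret
        # Only a real bool is accepted, so the int 1 is rejected
        if not isinstance(exact, bool):
            raise ValueError("Invalid exactness value returned from function " + func_name)
        return value, exact
    # Default heuristic: floats are inexact, integers and strings are exact.
    # Assumption: None is treated as exact as well.
    return ret, not isinstance(ret, float)
```

With this model, `split_return("f", 1.5)` falls back to the heuristic and reports an inexact value, while `split_return("f", (1.5, True))` takes the explicit override.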
+ diff --git a/specs/src/utils/PythonIntf.cc b/specs/src/utils/PythonIntf.cc index e554616..58f9169 100644 --- a/specs/src/utils/PythonIntf.cc +++ b/specs/src/utils/PythonIntf.cc @@ -156,6 +156,8 @@ class PythonFuncRec : public ExternalFunctionRec { PValue Call() { PValue pRet = nullptr; + bool exactnessSpecified = false; + bool exactness = true; // Check that all values were passed, complete those that haven't for (size_t i=0 ; isetExact(exactness); + } } else { if (PyErr_Occurred()) { switch (g_errorHandling) { diff --git a/specs/src/utils/alu.cc b/specs/src/utils/alu.cc index ebd8275..6f4325a 100644 --- a/specs/src/utils/alu.cc +++ b/specs/src/utils/alu.cc @@ -1111,7 +1111,13 @@ void dumpAluStack(const char* title, std::stack& stk) while (!stk.empty()) { PValue v = stk.top(); stk.pop(); - std::cerr << " > " << (v ? v->getStr() : "(nil)") << std::endl; + if (v) { + std::cerr << " > " << v->getStr() << " (" + << ((v->isExact()) ? "exact " : "inexact ") + << ALUCounterType2Str[v->getType()] << ")" << std::endl; + } else { + std::cerr << " > (nil)" << std::endl; + } tmp.push(v); } std::cerr << std::endl; diff --git a/specs/tests/pytest.py b/specs/tests/pytest.py index c80b78b..a5bc45a 100644 --- a/specs/tests/pytest.py +++ b/specs/tests/pytest.py @@ -149,3 +149,67 @@ def called_how_many_times(): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + +# Test exactness feature - exact float +lff = ''' +def exact_float(): + """Return an exact floating-point value""" + return (3.14159, True) + +def inexact_float(): + """Return an inexact floating-point value""" + return (3.14159, False) + +def inexact_int(): + """Return an inexact integer (overriding default)""" + return (42, False) + +def bad_tuple_size(): + """Return a tuple with wrong size""" + return (1, 2, 3) + +def bad_exactness_type(): + """Return a tuple with non-bool exactness""" + return (1.5, 1) +''' +set_localfuncs(lff) + +# Test exact float with True +sys.stdout.write("Test 13 (exact float 
with True) -- ") +ret = run_cmd('print "exact(exact_float())" 1') +if ret=="1": + sys.stdout.write("OK\n") +else: + sys.stdout.write("Not OK: <"+ret+">\n") + +# Test exact float with False +sys.stdout.write("Test 14 (inexact float with False) -- ") +ret = run_cmd('print "exact(inexact_float())" 1') +if ret=="0": + sys.stdout.write("OK\n") +else: + sys.stdout.write("Not OK: <"+ret+">\n") + +# Test overriding default exact int with False +sys.stdout.write("Test 15 (inexact int override) -- ") +ret = run_cmd('print "exact(inexact_int())" 1') +if ret=="0": + sys.stdout.write("OK\n") +else: + sys.stdout.write("Not OK: <"+ret+">\n") + +# Test bad tuple size +sys.stdout.write("Test 16 (bad tuple size) -- ") +ret = run_cmd('print "bad_tuple_size()" 1') +if "Invalid tuple returned from function bad_tuple_size" in ret: + sys.stdout.write("OK\n") +else: + sys.stdout.write("Not OK: <"+ret+">\n") + +# Test bad exactness type +sys.stdout.write("Test 17 (bad exactness type) -- ") +ret = run_cmd('print "bad_exactness_type()" 1') +if "Invalid exactness value returned from function bad_exactness_type" in ret: + sys.stdout.write("OK\n") +else: + sys.stdout.write("Not OK: <"+ret+">\n") From 01aca5058c655073b85fdd1789e708e205972721 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Sun, 17 May 2026 14:04:53 +0300 Subject: [PATCH 12/50] Add pytest to run_tests target (#379) --- specs/src/setup.py | 7 ++++ specs/tests/pytest.py | 65 ++++++++++++++++++++++++++++---------- specs/tests/recfm_tests.py | 3 +- 3 files changed, 58 insertions(+), 17 deletions(-) diff --git a/specs/src/setup.py b/specs/src/setup.py index e76c132..6221f9d 100644 --- a/specs/src/setup.py +++ b/specs/src/setup.py @@ -734,6 +734,13 @@ def python_search(arg): if osversion != "": condlink = condlink + " -mmacosx-version-min={}".format(osversion) +# Add pytest.py to run_tests if Python support is available +if CFG_python: + body2 = body2.replace( + "python3 $(TESTS_DIR)/recfm_tests.py", + "python3 
$(TESTS_DIR)/recfm_tests.py\n\tpython3 $(TESTS_DIR)/pytest.py" + ) + with open("Makefile", "w") as makefile: makefile.write("CXX={}\n".format(cxx)) makefile.write("LINKER={}\n".format("link.exe" if (compiler=="VS") else cxx)) diff --git a/specs/tests/pytest.py b/specs/tests/pytest.py index a5bc45a..4fcd826 100644 --- a/specs/tests/pytest.py +++ b/specs/tests/pytest.py @@ -1,23 +1,37 @@ -import os,sys +import os,sys,subprocess def run_cmd(spec, force=False): + args = ["../exe/specs", "--set", "SPECSPATH=/tmp", "-o", "theout"] if force: - cmd = "../exe/specs --pythonFuncs on --set SPECSPATH=/tmp -o theout " + spec + "&> theerr" + args.extend(["--pythonFuncs", "on"]) + args.append(spec) + + with open("theerr", "w") as err_file: + rc = subprocess.call(args, stdout=subprocess.DEVNULL, stderr=err_file) + + if os.path.exists("theerr"): + with open("theerr", "r") as f: + theerr_content = f.readlines() + os.system("/bin/rm theerr") else: - cmd = "../exe/specs --set SPECSPATH=/tmp -o theout " + spec + "&> theerr" - rc = os.system(cmd) - if rc!=0 and rc!=2048: - ret = "RC="+str(rc) - elif rc==0 and os.path.exists("theout"): - with open("theout","r") as out: - ret = out.read() + theerr_content = "" + + if os.path.exists("theout"): + with open("theout", "r") as f: + theout_content = f.read() os.system("/bin/rm theout") - elif os.path.exists("theerr"): - with open("theerr","r") as err: - ret = err.readlines()[-1] - os.system("/bin/rm theerr") + else: + theout_content = "" + + if rc!=0 and rc!=8: + ret = "RC="+str(rc) + elif rc==0 and theout_content != "": + ret = theout_content + elif theerr_content != "": + ret = theerr_content[-1] else: ret = "something happened" + return ret.strip() def set_localfuncs(lf): @@ -39,6 +53,7 @@ def set_localfuncs(lf): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # while still not having any file there, try calling the kuku function sys.stdout.write("Test 02 (unknown function; no file) -- ") @@ -47,6 +62,7 @@ 
def set_localfuncs(lf): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # So let's try loading an invalid file lff = ''' @@ -62,22 +78,25 @@ def set_localfuncs(lf): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # But if we force it... sys.stdout.write("Test 04 (bad file; non-python function; force) -- ") ret = run_cmd('print "sqrt(81)" 1', True) -if ret=="Python Interface: Error loading local functions": +if ret=="Python Interface: Error loading local functions" or ret=="SyntaxError: invalid syntax": sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # Or call a non-built-in function... sys.stdout.write("Test 05 (bad file; unknown function) -- ") ret = run_cmd('print "kuku(16)" 1') -if ret=="Python Interface: Error loading local functions": +if ret=="Python Interface: Error loading local functions" or ret=="SyntaxError: invalid syntax": sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # Now for a valid file lff = ''' @@ -101,14 +120,16 @@ def called_how_many_times(): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # FP parameter sys.stdout.write("Test 07 (float parameter) -- ") ret = run_cmd('print "plus1(3.2)" 1') -if ret=="4.2": +if abs(float(ret)-4.2) < 0.0001: sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # string parameter - should abend sys.stdout.write("Test 08 (bad parameter; should abend) -- ") @@ -117,6 +138,7 @@ def called_how_many_times(): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # a function with memory sys.stdout.write("Test 09 (function with memory; first run) -- ") @@ -125,6 +147,7 @@ def called_how_many_times(): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # a function with memory sys.stdout.write("Test 10 (function with memory; second run) -- ") @@ -133,6 +156,7 @@ def 
called_how_many_times(): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # a function that does not exist sys.stdout.write("Test 11 (non-existent function) -- ") @@ -141,6 +165,7 @@ def called_how_many_times(): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # call a python imported function sys.stdout.write("Test 12 (imported function) -- ") @@ -149,6 +174,7 @@ def called_how_many_times(): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # Test exactness feature - exact float lff = ''' @@ -181,6 +207,7 @@ def bad_exactness_type(): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # Test exact float with False sys.stdout.write("Test 14 (inexact float with False) -- ") @@ -189,6 +216,7 @@ def bad_exactness_type(): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # Test overriding default exact int with False sys.stdout.write("Test 15 (inexact int override) -- ") @@ -197,6 +225,7 @@ def bad_exactness_type(): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # Test bad tuple size sys.stdout.write("Test 16 (bad tuple size) -- ") @@ -205,6 +234,7 @@ def bad_exactness_type(): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) # Test bad exactness type sys.stdout.write("Test 17 (bad exactness type) -- ") @@ -213,3 +243,6 @@ def bad_exactness_type(): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) + +sys.stdout.write("\n*** All 17 tests passed.\n") \ No newline at end of file diff --git a/specs/tests/recfm_tests.py b/specs/tests/recfm_tests.py index ff27d9d..9fc0614 100644 --- a/specs/tests/recfm_tests.py +++ b/specs/tests/recfm_tests.py @@ -303,4 +303,5 @@ def cleanup(): sys.stdout.write("*** {} out of {} tests FAILED.\n".format(fail_counter, case_counter)) sys.exit(1) else: - sys.stdout.write("*** All {} tests 
passed.\n".format(case_counter)) + sys.stdout.write("*** All {} tests passed.\n\n".format(case_counter)) + From b19b58aefc18fe1aca76e2658f290da1eafb727f Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Sun, 17 May 2026 17:18:33 +0300 Subject: [PATCH 13/50] Issue #344 - Allow Python functions with exactness in args (#380) --- manpage | 28 ++++++++++ specs/docs/pyfuncs.md | 91 ++++++++++++++++++++++++++++++++ specs/src/gdb/specs_gdb.py | 7 +++ specs/src/utils/PythonIntf.cc | 72 ++++++++++++++++++++++++- specs/tests/pytest.py | 98 ++++++++++++++++++++++++++++++++++- 5 files changed, 293 insertions(+), 3 deletions(-) diff --git a/manpage b/manpage index 7875139..7183d75 100644 --- a/manpage +++ b/manpage @@ -1825,6 +1825,34 @@ For example, .B return (3.14159, True) returns an exact floating-point value. .PP +By default, Python functions receive their arguments as plain values with exactness information stripped. +To allow a function to receive exactness information with its arguments, set the +.B arg_type +attribute on the function to the string +.B \[dq]exact\[dq]. +When +.B arg_type = \[dq]exact\[dq] +is set, every argument is passed as a 2-tuple +.B (value, exactness) +where +.I value +is the argument value and +.I exactness +is a Python boolean +.B True +or +.B False +indicating whether the value is exact. This allows functions to inspect and propagate exactness information. +For example: +.RS +.nf +def lowindex(x): + return (int(x[0]) % 65536, x[1]) + +lowindex.arg_type = "exact" +.fi +.RE +.PP Docstrings are used by .B specs --help pyfuncs when listing loaded Python functions. Use diff --git a/specs/docs/pyfuncs.md b/specs/docs/pyfuncs.md index a4b25b0..01ed26c 100644 --- a/specs/docs/pyfuncs.md +++ b/specs/docs/pyfuncs.md @@ -153,3 +153,94 @@ specs print "exact(inexact_sqrt())" 1 ``` would print `0` (false). +## Argument Exactness + +By default, Python functions receive their arguments as plain values, and exactness information is discarded. 
However, you can configure a function to receive exactness information with its arguments by setting the `arg_type` attribute to `"exact"`. + +### Setting `arg_type = "exact"` + +To enable argument exactness, add this line after your function definition: + +```python +def my_function(x, y): + # function body + pass + +my_function.arg_type = "exact" +``` + +When `arg_type = "exact"` is set, **all arguments** are passed as 2-tuples instead of plain values: +- The first element is the argument value (integer, float, string, or None) +- The second element is a Python boolean: `True` if the value is exact, `False` if inexact + +### Example: Propagating Exactness + +Here's a practical example that propagates exactness information: + +```python +def lowindex(x): + '''Returns the lower 16 bits of the number, and copies exactness''' + return (int(x[0]) % 65536, x[1]) + +lowindex.arg_type = "exact" + +def highindex(x): + '''Returns the top 16 bits - assuming x is limited to 32 bits''' + return int(x) // 65536 +``` + +In this example: +- `lowindex` receives its argument as a 2-tuple `(value, exactness)` and returns a 2-tuple preserving the exactness +- `highindex` receives a plain value (no `arg_type` set) and returns a plain integer +- Note that there is no requirement to combine argument exactness and return exactness. The value returned from `highindex` is an integer, and therefore defaults to exact, even if the value passed to this function was originally marked as inexact. 
+ +You can verify the behavior with: + +``` +specs print "exact(lowindex(4/3))" 1 print "exact(lowindex(4))" NEXTWORD print "exact(highindex(4/3))" NEXTWORD +``` + +This prints `0 1 1`: +- `lowindex(4/3)` receives an inexact value (4/3), returns it with exactness=0 +- `lowindex(4)` receives an exact value (4), returns it with exactness=1 +- `highindex(4/3)` receives a plain value, returns an integer (which is exact by default), so exactness=1 + +### Combining Argument and Return Exactness + +A function can both receive argument exactness and return exactness information. For example: + +```python +def add_exact(a, b): + '''Add two numbers, exact only if both inputs are exact''' + return (a[0] + b[0], a[1] and b[1]) + +add_exact.arg_type = "exact" +``` + +This function: +1. Receives both arguments as 2-tuples (because `arg_type = "exact"`) +2. Returns a 2-tuple with the sum and a boolean indicating exactness (both inputs must be exact) +3. Is incorrect. If `a` and `b` are floats, and one is significantly bigger than the other in absolute value, there is going to be some rounding. In general float addition results in an inexact value. + +### Error Handling + +If `arg_type` is set to a value other than `"exact"`, **specs** will report an error during initialization: + +```python +def bad_function(x): + pass + +bad_function.arg_type = "bogus" # ERROR: Invalid arg_type value +``` + +Also, if a function has `arg_type = "exact"` but tries to use an argument as a plain value (or vice versa), Python will raise a `TypeError`: + +```python +def bad_use(x): + return x + 1 # ERROR: can't add tuple + int + +bad_use.arg_type = "exact" +``` + +When such errors occur, **specs** will report them as external function errors. 
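The argument-passing rules described in the Argument Exactness section can also be sketched in plain Python. This is a rough model under stated assumptions, not the actual implementation: the dispatcher `call_python_func` is a hypothetical name, it takes the arguments as a list of `(value, exact)` pairs, and it checks `arg_type` at call time for brevity, whereas the text above says **specs** reports an invalid `arg_type` during initialization. `lowindex` and `highindex` are the functions from the example above.

```python
def call_python_func(func, args):
    """Hypothetical model; args is a list of (value, exact) pairs."""
    arg_type = getattr(func, "arg_type", None)
    if arg_type is None:
        # Default: exactness is stripped, the function sees plain values
        passed = [value for value, _ in args]
    elif arg_type == "exact":
        # Every argument is passed as a (value, exactness) 2-tuple
        passed = [(value, exact) for value, exact in args]
    else:
        raise ValueError('Invalid arg_type value for function %s - must be "exact"'
                         % func.__name__)
    return func(*passed)


def lowindex(x):
    '''Returns the lower 16 bits of the number, and copies exactness'''
    return (int(x[0]) % 65536, x[1])

lowindex.arg_type = "exact"


def highindex(x):
    '''Returns the top 16 bits - assuming x is limited to 32 bits'''
    return int(x) // 65536
```

In this model `call_python_func(lowindex, [(4/3, False)])` carries the `False` flag through to the result, while `highindex` never sees the flag and returns a plain integer.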
+ diff --git a/specs/src/gdb/specs_gdb.py b/specs/src/gdb/specs_gdb.py index 7c6f5a8..af1ca04 100644 --- a/specs/src/gdb/specs_gdb.py +++ b/specs/src/gdb/specs_gdb.py @@ -1294,6 +1294,13 @@ def invoke(self, arg, from_tty): except Exception as e: print(f" m_doc: (error: {e})") + # Show m_argTypeExact + try: + m_argTypeExact = bool(val["m_argTypeExact"]) + print(f" m_argTypeExact: {m_argTypeExact}") + except Exception as e: + print(f" m_argTypeExact: (error: {e})") + # Show m_pTuple try: m_pTuple = val["m_pTuple"] diff --git a/specs/src/utils/PythonIntf.cc b/specs/src/utils/PythonIntf.cc index 58f9169..368a239 100644 --- a/specs/src/utils/PythonIntf.cc +++ b/specs/src/utils/PythonIntf.cc @@ -106,6 +106,9 @@ class PythonFuncRec : public ExternalFunctionRec { void setDoc(const char* cstr) { m_doc = cstr; } + void setArgTypeExact(bool v) { m_argTypeExact = v; } + bool isArgTypeExact() const { return m_argTypeExact; } + void setArgValue(size_t idx, PValue pValue) { PyObject* pValObj; size_t argCount = GetArgCount(); @@ -151,6 +154,16 @@ class PythonFuncRec : public ExternalFunctionRec { Py_INCREF(Py_None); } + // If arg_type=exact, wrap the value as a (value, exactness) tuple + if (m_argTypeExact) { + PyObject* pExactness = pValue->isExact() ? 
Py_True : Py_False; + Py_INCREF(pExactness); + PyObject* pArgTuple = PyTuple_New(2); + PyTuple_SetItem(pArgTuple, 0, pValObj); + PyTuple_SetItem(pArgTuple, 1, pExactness); + pValObj = pArgTuple; + } + PyTuple_SetItem(m_pTuple, idx, pValObj); } @@ -291,6 +304,9 @@ class PythonFuncRec : public ExternalFunctionRec { strm << arg.getStr(); } strm << ")"; + if (m_argTypeExact) { + strm << " [arg_type=exact]"; + } if (m_doc.length() > 0) { if (m_doc.find("\n") != std::string::npos) { strm << " :\n" << m_doc << "\n"; @@ -306,6 +322,7 @@ class PythonFuncRec : public ExternalFunctionRec { std::vector m_args; PyObject* m_pTuple; std::string m_doc; + bool m_argTypeExact = false; }; typedef std::shared_ptr PPythonFuncRec; @@ -447,7 +464,16 @@ class PythonFunctionCollection : public ExternalFunctionCollection { PyTuple_SetItem(pTuple, 0, pFunc); PyObject* pArgSpec = PyObject_CallObject(pArgSpecFunc, pTuple); - MYASSERT_NOT_NULL_WITH_DESC(pArgSpec,funcName); + if (!pArgSpec) { + // getargspec failed (e.g., for built-in C functions) - skip this function + PyErr_Clear(); + Py_DECREF(pRepr); +#ifdef PYTHON_VER_3 + Py_DECREF(pStr); +#endif + Py_DECREF(pTuple); + continue; + } PyObject* pArgList = PyObject_GetAttrString(pArgSpec, "args"); MYASSERT_NOT_NULL(pArgList); @@ -517,6 +543,50 @@ class PythonFunctionCollection : public ExternalFunctionCollection { } Py_DECREF(pDoc); } + + // Check for arg_type attribute + if (PyObject_HasAttrString(pFunc, "arg_type")) { + PyObject* pArgType = PyObject_GetAttrString(pFunc, "arg_type"); + MYASSERT_NOT_NULL(pArgType); + + // Check if it's a string equal to "exact" + bool isExact = false; +#ifdef PYTHON_VER_2 + if (PyString_Check(pArgType)) { + if (0 == strcmp(PyString_AS_STRING(pArgType), "exact")) { + isExact = true; + } + } +#else + if (PyUnicode_Check(pArgType)) { + PyObject* pArgTypeBytes = PyUnicode_AsASCIIString(pArgType); + if (pArgTypeBytes) { + if (0 == strcmp(PyBytes_AS_STRING(pArgTypeBytes), "exact")) { + isExact = true; + } + 
Py_DECREF(pArgTypeBytes); + } + } +#endif + + if (isExact) { + pFuncRec->setArgTypeExact(true); + } else { + // arg_type exists but is not "exact" - error + std::string err = "Invalid arg_type value for function "; + err += funcName; + err += " - must be \"exact\""; + if (g_bVerbose) { + PyObject* pRepr = PyObject_Repr(pArgType); + err += ". Got: "; + err += PyUnicode_AsUTF8(pRepr); + Py_DECREF(pRepr); + } + Py_DECREF(pArgType); + MYTHROW(err); + } + Py_DECREF(pArgType); + } } Py_DECREF(pRepr); #ifdef PYTHON_VER_3 diff --git a/specs/tests/pytest.py b/specs/tests/pytest.py index 4fcd826..c313673 100644 --- a/specs/tests/pytest.py +++ b/specs/tests/pytest.py @@ -100,10 +100,13 @@ def set_localfuncs(lf): # Now for a valid file lff = ''' -from math import factorial +from math import factorial as _factorial def plus1(a): return a+1 + +def factorial(n): + return _factorial(int(n)) calling_count = 0 def called_how_many_times(): @@ -197,6 +200,26 @@ def bad_tuple_size(): def bad_exactness_type(): """Return a tuple with non-bool exactness""" return (1.5, 1) + +def lowindex(x): + """Returns the lower 16 bits of the number, and copies exactness""" + return (int(x[0]) % 65536, x[1]) + +lowindex.arg_type = "exact" + +def highindex(x): + """Returns the top 16 bits""" + return int(x) // 65536 + +def bad_exact_use(x): + """Tries to use a tuple arg as a number - will TypeError""" + return x + 1 + +bad_exact_use.arg_type = "exact" + +def bad_plain_use(x): + """Tries to index into a plain value - will TypeError""" + return x[0] ''' set_localfuncs(lff) @@ -245,4 +268,75 @@ def bad_exactness_type(): sys.stdout.write("Not OK: <"+ret+">\n") exit(4) -sys.stdout.write("\n*** All 17 tests passed.\n") \ No newline at end of file +# Test lowindex with inexact argument +sys.stdout.write("Test 18 (lowindex with inexact arg) -- ") +ret = run_cmd('print "exact(lowindex(4/3))" 1') +if ret=="0": + sys.stdout.write("OK\n") +else: + sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) + +# Test lowindex 
with exact argument +sys.stdout.write("Test 19 (lowindex with exact arg) -- ") +ret = run_cmd('print "exact(lowindex(4))" 1') +if ret=="1": + sys.stdout.write("OK\n") +else: + sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) + +# Test highindex without arg_type (normal behavior) +sys.stdout.write("Test 20 (highindex without arg_type) -- ") +ret = run_cmd('print "exact(highindex(4/3))" 1') +if ret=="1": + sys.stdout.write("OK\n") +else: + sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) + +# Test combined spec from problem statement +sys.stdout.write("Test 21 (combined spec) -- ") +ret = run_cmd('print "exact(lowindex(4/3))" 1 print "exact(lowindex(4))" NEXTWORD print "exact(highindex(4/3))" NEXTWORD') +if ret=="0 1 1": + sys.stdout.write("OK\n") +else: + sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) + +# Test bad_exact_use - function tries to use tuple arg as number +sys.stdout.write("Test 22 (bad_exact_use - TypeError) -- ") +ret = run_cmd('print "bad_exact_use(5)" 1') +if "Runtime error. Error in external function" in ret: + sys.stdout.write("OK\n") +else: + sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) + +# Test bad_plain_use - function tries to index plain value +sys.stdout.write("Test 23 (bad_plain_use - TypeError) -- ") +ret = run_cmd('print "bad_plain_use(5)" 1') +if "Runtime error. 
Error in external function" in ret: + sys.stdout.write("OK\n") +else: + sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) + +# Test bad_arg_type - invalid arg_type value +sys.stdout.write("Test 24 (bad_arg_type - invalid value) -- ") +lff_bad = ''' +def bad_arg_type(): + """Function with invalid arg_type value""" + return 42 + +bad_arg_type.arg_type = "bogus" +''' +set_localfuncs(lff_bad) +ret = run_cmd('print "bad_arg_type()" 1', True) +if "Invalid arg_type value for function bad_arg_type" in ret: + sys.stdout.write("OK\n") +else: + sys.stdout.write("Not OK: <"+ret+">\n") + exit(4) + +sys.stdout.write("\n*** All 24 tests passed.\n") From e111daa9e87ad413c0b4b2cc38b002a0e0425c96 Mon Sep 17 00:00:00 2001 From: niry1 Date: Mon, 18 May 2026 10:43:09 +0300 Subject: [PATCH 14/50] Issue #106 - first commit for rolling context --- .github/workflows/c-cpp.yml | 4 ++-- README.md | 1 + 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/.github/workflows/c-cpp.yml b/.github/workflows/c-cpp.yml index 8d4c777..95a5cc0 100644 --- a/.github/workflows/c-cpp.yml +++ b/.github/workflows/c-cpp.yml @@ -2,9 +2,9 @@ name: C/C++ CI on: push: - branches: [ dev, stable, dev-0.9.9, dev-1.0.0] + branches: [ dev, stable, dev-0.9.9, dev-1.0.0, dev-rolling-context] pull_request: - branches: [ dev, stable, dev-0.9.9, dev-1.0.0] + branches: [ dev, stable, dev-0.9.9, dev-1.0.0, dev-rolling-context] env: SPECS_BRANCH: ${{ github.event.pull_request.base.ref || github.ref_name }} diff --git a/README.md b/README.md index fb65923..ed28841 100644 --- a/README.md +++ b/README.md @@ -17,6 +17,7 @@ What's new: * Support Python in `MSBuild` builds * Added MSI and stand-alone Windows executable to release artifacts * Debugging aids for GDB + * Rolling context support *** 1-May-2026: Version 0.9.9 is here From 9ff353f6456481b1c56eab5837ab164cb54c14e9 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Tue, 19 May 2026 12:44:34 +0300 Subject: [PATCH 15/50] Issue #106 - Commit first implementation (#381) * 
Add the verbose indication of context size * Add gdb macro improvements * Separate input record from current record. We add an m_inputRecord to distinguish from m_ps, so that we have a stable string to assign to m_prevRecord even if m_ps was modified by CONTEXT --------- Co-authored-by: niry1 --- AGENTS.md | 58 ++++++++++++++++ agents.md | 9 --- manpage | 57 +++++++++++++++ specs/src/cli/tokens.cc | 33 +++++++++ specs/src/cli/tokens.h | 1 + specs/src/gdb/specs.gdb | 11 +++ specs/src/gdb/specs_gdb.py | 89 +++++++++++++++++++++--- specs/src/processing/ProcessingState.cc | 16 ++++- specs/src/processing/ProcessingState.h | 2 + specs/src/processing/Reader.cc | 92 ++++++++++++++++++++++++- specs/src/processing/Reader.h | 15 ++++ specs/src/specitems/item.h | 13 ++++ specs/src/specitems/specItems.cc | 38 ++++++++++ specs/src/specitems/specItems.h | 3 + specs/src/test/ProcessingTest.cc | 59 ++++++++++++++++ specs/src/test/specs.cc | 21 ++++++ specs/src/utils/alu.cc | 48 ++++++++++++- specs/src/utils/alu.h | 8 ++- specs/tests/pytest.py | 4 +- specs/tests/valgrind_unit_tests.py | 2 +- 20 files changed, 550 insertions(+), 29 deletions(-) create mode 100644 AGENTS.md delete mode 100644 agents.md diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 0000000..9753040 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,58 @@ +# Agent Guidelines for specs2016 + +## General Project Guidelines + +- When adding or removing tests in `specs/src/test/ProcessingTest.cc`, also update `count_processing_tests` in `specs/tests/valgrind_unit_tests.py`. +- When adding or removing tests in `specs/src/test/ALUUnitTest.cc`, also update `count_ALU_tests` in `specs/tests/valgrind_unit_tests.py`. +- When adding or removing tests in `specs/src/test/TokenTest.cc`, also update `count_token_tests` in `specs/tests/valgrind_unit_tests.py`. +- Keep any new `ProcessingTest.cc` regression cases appended at the end when possible, to avoid unnecessary renumbering. 
+- `MYASSERT(cond)` and `MYASSERT_WITH_MSG(cond, msg)` (defined in `specs/src/utils/ErrorReporting.h`) are **always-on runtime checks** that throw `SpecsException` via `MYTHROW`. They are *not* compiled out by `NDEBUG`. Do not add redundant `if`/`MYTHROW` guards that duplicate what a `MYASSERT` already covers. +- When building the project, always use the command-line `make clean all`. Do not skip the `clean` target, and do not use the `-j` argument. +- When changing the structure of any of the classes that have dump_ macros in `specs_gdb.py` and `specs.gdb` update the relevant macros as well. + +## Rolling Context Feature (Issue #106) + +### Token Normalization in tokens.cc + +The CONTEXT keyword requires special handling during token normalization (`normalizeTokenList` in `tokens.cc`): + +1. **Initial State**: When a CONTEXT token is first parsed, its `Literal()` field is empty. + +2. **Normalization Process** (lines 920-951 in tokens.cc): + - Check if the next token exists: `i+1 < tokList->size()` + - If yes, extract the integer offset from the next token (which can be a RANGE or a literal) + - Store the offset string in the CONTEXT token's literal field + - Erase the next token from the list + - If no next token exists, the literal remains empty + +3. 
**Compile-Time Validation** (lines 473-489 in specItems.cc): + - When `itemGroup::Compile` processes a CONTEXT token, it must validate that the literal is not empty + - Use `tokenVec[index].argIndex()` (not the vector index) for error messages + - Call `std::stoi(tokenVec[index].Literal())` only after validation + +### Key Insight + +The `if` statement at line 923 in tokens.cc checks `tok.Literal()==""` because: +- If there is no next token (`i+1 >= tokList->size()`), the condition is false +- The literal is never set +- The error is caught at compile time in specItems.cc with a proper error message + +**Example that triggers the error:** +``` +specs '1-* 1 CONTEXT' +``` +This produces: "CONTEXT at index X must be followed by an integer offset" + +### Testing + +All rolling context tests are in `specs/src/test/ProcessingTest.cc` (tests #229-#241). +Run with: `../exe/ProcessingTest` + +### Verbose Output + +When rolling context is used and verbose mode is enabled (`-v` flag), the buffer sizes are printed: +``` +Rolling context buffer sizes: forward= backward= +``` + +This is implemented in `specs/src/test/specs.cc` after the compilation phase. diff --git a/agents.md b/agents.md deleted file mode 100644 index 2824abb..0000000 --- a/agents.md +++ /dev/null @@ -1,9 +0,0 @@ -# Agents Notes - -- When adding or removing tests in `specs/src/test/ProcessingTest.cc`, also update `count_processing_tests` in `specs/tests/valgrind_unit_tests.py`. -- When adding or removing tests in `specs/src/test/ALUUnitTest.cc`, also update `count_ALU_tests` in `specs/tests/valgrind_unit_tests.py`. -- When adding or removing tests in `specs/src/test/TokenTest.cc`, also update `count_token_tests` in `specs/tests/valgrind_unit_tests.py`. -- Keep any new `ProcessingTest.cc` regression cases appended at the end when possible, to avoid unnecessary renumbering. 
-- `MYASSERT(cond)` and `MYASSERT_WITH_MSG(cond, msg)` (defined in `specs/src/utils/ErrorReporting.h`) are **always-on runtime checks** that throw `SpecsException` via `MYTHROW`. They are *not* compiled out by `NDEBUG`. Do not add redundant `if`/`MYTHROW` guards that duplicate what a `MYASSERT` already covers. -- When building the project, always use the command-line `make clean all`. Do not skip the `clean` target, and do not use the `-j` argument. -- When changing the structure of any of the classes that have dump_ macros in `specs_gdb.py` and `specs.gdb` update the relevant macros as well. diff --git a/manpage b/manpage index 7183d75..e1fea6e 100644 --- a/manpage +++ b/manpage @@ -357,6 +357,48 @@ Similarly: specs w1 a: SKIP-UNTIL "a=BEGIN" 1-* 1 .P skips records until the first record whose first word is BEGIN, then processes that record and all subsequent records normally. +.IP "CONTEXT" 3 +Temporarily changes the working string to a nearby record without consuming it. +.B CONTEXT +is followed by an integer offset: +.B CONTEXT 0 +resets the working string to the current record, +.B CONTEXT 1 +sets it to the next record, +.B CONTEXT -1 +sets it to the previous record, and so on. +The required buffer sizes are computed at parse time from the largest positive and negative offsets used. If the offset refers to a record beyond the beginning or end of the input, the working string is set to the empty string. +.P +.B CONTEXT +must not be abbreviated. It is not supported with threading or multiple input streams. +.P +Example: +.P + echo -e "A\\nB\\nC" | specs 1-* 1 CONTEXT 1 1-* NW +.P +produces: "A B", "B C", "C" -- each line shows the current record followed by the next. +.P +In expressions, the syntax +.B @+n +and +.B @-n +provides access to nearby records without changing the working string. For example, +.B @+1 +refers to the next record and +.B @-1 +refers to the previous record. +.B @+0 +and +.B @-0 +are equivalent to +.B @@ +(the current record). 
+.P +Example: +.P + echo -e "A\\nB\\nC" | specs print @+1 1 +.P +produces: "B", "C", "" -- each line shows the next record (empty for the last). .SS "MainOptions" These are optional spec units that appear at the beginning of the specification and modify the behavior of the entire specification. @@ -680,6 +722,21 @@ will output 4. The assignment is performed and the value stored in the counter. .IP "At-sign (@) Operator" 3 The at-sign allows the inclusion of user-defined and system-defined labels as strings in expressions. The double at-sign (@@) substitutes for the entire input record. +The syntax +.B @+n +and +.B @-n +(where n is a natural number) accesses nearby records from the rolling context buffer. +.B @+1 +is the next record, +.B @-1 +is the previous record. +.B @+0 +and +.B @-0 +are equivalent to +.B @@. +Out-of-bounds offsets return the empty string. The forward and backward buffer sizes are computed automatically at parse time. This feature requires non-threaded mode and a single input stream. 
.SS "Built-In Functions" .IP "abs(x)" 3 diff --git a/specs/src/cli/tokens.cc b/specs/src/cli/tokens.cc index b82d9eb..c627e1a 100644 --- a/specs/src/cli/tokens.cc +++ b/specs/src/cli/tokens.cc @@ -367,6 +367,7 @@ void parseSingleToken(std::vector *pVec, std::string arg, int argidx) SIMPLETOKEN(skip-until, SKIPUNTIL); SIMPLETOKEN(splitw, SPLITW); SIMPLETOKEN(splitf, SPLITF); + SIMPLETOKEN(context, CONTEXT); /* question mark to replace PRINT */ if (arg[0]=='?') { @@ -916,6 +917,38 @@ void normalizeTokenList(std::vector *tokList) tok.setLiteral(separator); break; } + case TokenListType__CONTEXT: + { + if (i+1 < tokList->size()) { + std::string offsetStr; + if (TokenListType__RANGE == nextTok.Type() && nextTok.Range() && nextTok.Range()->isSingleNumber()) { + offsetStr = std::to_string(nextTok.Range()->getSingleNumber()); + nextTok.deallocDynamic(); + } else if (mayBeLiteral(nextTok)) { + offsetStr = getLiteral(nextTok); + } else { + std::string err = "CONTEXT at index " + std::to_string(tok.argIndex()) + + " must be followed by an integer offset, got <" + nextTok.Orig() + ">"; + MYTHROW(err); + } + // Validate that offsetStr is a valid integer + try { + std::stoi(offsetStr); + } catch (...) 
{ + std::string err = "CONTEXT at index " + std::to_string(tok.argIndex()) + + " must be followed by an integer offset, got <" + offsetStr + ">"; + MYTHROW(err); + } + tok.setLiteral(offsetStr); + tokList->erase(tokList->begin()+(i+1)); + } + if (tok.Literal()=="") { + std::string err = "CONTEXT at index " + std::to_string(tok.argIndex()) + + " must be followed by an integer offset"; + MYTHROW(err); + } + break; + } default: break; } diff --git a/specs/src/cli/tokens.h b/specs/src/cli/tokens.h index 296b00f..369e42a 100644 --- a/specs/src/cli/tokens.h +++ b/specs/src/cli/tokens.h @@ -79,6 +79,7 @@ X(SKIPWHILE, false, true) \ X(SPLITW, false, false) \ X(SPLITF, false, false) \ + X(CONTEXT, false, true) \ X(DUMMY, false, false) #define X(t,r,l) TokenListType__##t, diff --git a/specs/src/gdb/specs.gdb b/specs/src/gdb/specs.gdb index 36fc163..630ad02 100644 --- a/specs/src/gdb/specs.gdb +++ b/specs/src/gdb/specs.gdb @@ -88,6 +88,10 @@ define dump_break_item dump-break-item $arg0 end +define dump_context_item + dump-context-item $arg0 +end + define dump_select_item dump-select-item $arg0 end @@ -214,6 +218,13 @@ document bp_parseAluExpression Set a breakpoint on parseAluExpression to debug parsing of mathematical expressions. end +define bp_context_apply + break ContextItem::apply +end +document bp_context_apply + Set a breakpoint on ContextItem::apply to debug rolling context operations. 
+end + define bp_pfc_initialize break PythonFunctionCollection::Initialize end diff --git a/specs/src/gdb/specs_gdb.py b/specs/src/gdb/specs_gdb.py index af1ca04..9f0799c 100644 --- a/specs/src/gdb/specs_gdb.py +++ b/specs/src/gdb/specs_gdb.py @@ -605,6 +605,8 @@ def invoke(self, arg, from_tty): print(f" isBreak: {bool(is_break)}") except: pass + + print("----- end of 'Item' dump") except Exception as e: print(f"Error: {e}") @@ -613,9 +615,16 @@ class DumpDataField(gdb.Command): def __init__(self): super(DumpDataField, self).__init__("dump-data-field", gdb.COMMAND_DATA) + self.dump_item = None def invoke(self, arg, from_tty): try: + # First, call DumpItem to print base class fields + if self.dump_item is None: + self.dump_item = DumpItem() + self.dump_item.invoke(arg, from_tty) + + # Then print derived class fields val = gdb.parse_and_eval(arg) label = chr(int(val["m_label"])) if int(val["m_label"]) > 0 else "none" out_start = int(val["m_outStart"]) @@ -627,7 +636,6 @@ def invoke(self, arg, from_tty): conv_str = STRING_CONVERSIONS.get(conv, f"Unknown({conv})") align_str = OUTPUT_ALIGNMENT.get(align, f"Unknown({align})") - print(f"DataField @ {val.address}") print(f" m_label: {label}") print(f" m_outStart: {out_start}") print(f" m_maxLength: {max_len}") @@ -642,18 +650,23 @@ class DumpTokenItem(gdb.Command): def __init__(self): super(DumpTokenItem, self).__init__("dump-token-item", gdb.COMMAND_DATA) + self.dump_item = None def invoke(self, arg, from_tty): try: + # First, call DumpItem to print base class fields + if self.dump_item is None: + self.dump_item = DumpItem() + self.dump_item.invoke(arg, from_tty) + + # Then print derived class fields val = gdb.parse_and_eval(arg) token = deref_shared_ptr(val["mp_Token"]) if token: type_val = int(token["m_type"]) type_str = TOKEN_TYPES.get(type_val, f"Unknown({type_val})") - print(f"TokenItem @ {val.address}") print(f" Token type: {type_str}") else: - print(f"TokenItem @ {val.address}") print(f" mp_Token: ") except 
Exception as e: print(f"Error: {e}") @@ -663,13 +676,19 @@ class DumpSetItem(gdb.Command): def __init__(self): super(DumpSetItem, self).__init__("dump-set-item", gdb.COMMAND_DATA) + self.dump_item = None def invoke(self, arg, from_tty): try: + # First, call DumpItem to print base class fields + if self.dump_item is None: + self.dump_item = DumpItem() + self.dump_item.invoke(arg, from_tty) + + # Then print derived class fields val = gdb.parse_and_eval(arg) raw_expr = std_string_to_str(val["m_rawExpression"]) key = int(val["m_key"]) - print(f"SetItem @ {val.address}") print(f" m_rawExpression: \"{raw_expr}\"") print(f" m_key: {key}") except Exception as e: @@ -680,15 +699,21 @@ class DumpSkipItem(gdb.Command): def __init__(self): super(DumpSkipItem, self).__init__("dump-skip-item", gdb.COMMAND_DATA) + self.dump_item = None def invoke(self, arg, from_tty): try: + # First, call DumpItem to print base class fields + if self.dump_item is None: + self.dump_item = DumpItem() + self.dump_item.invoke(arg, from_tty) + + # Then print derived class fields val = gdb.parse_and_eval(arg) raw_expr = std_string_to_str(val["m_rawExpression"]) is_until = bool(val["m_bIsUntil"]) satisfied = bool(val["m_bSatisfied"]) skip_type = "SKIPUNTIL" if is_until else "SKIPWHILE" - print(f"SkipItem @ {val.address}") print(f" Type: {skip_type}") print(f" m_rawExpression: \"{raw_expr}\"") print(f" m_bSatisfied: {satisfied}") @@ -700,15 +725,21 @@ class DumpConditionItem(gdb.Command): def __init__(self): super(DumpConditionItem, self).__init__("dump-condition-item", gdb.COMMAND_DATA) + self.dump_item = None def invoke(self, arg, from_tty): try: + # First, call DumpItem to print base class fields + if self.dump_item is None: + self.dump_item = DumpItem() + self.dump_item.invoke(arg, from_tty) + + # Then print derived class fields val = gdb.parse_and_eval(arg) pred = int(val["m_pred"]) pred_str = CONDITION_PREDICATE.get(pred, f"Unknown({pred})") raw_expr = std_string_to_str(val["m_rawExpression"]) 
is_assn = bool(val["m_isAssignment"]) - print(f"ConditionItem @ {val.address}") print(f" m_pred: {pred_str}") print(f" m_rawExpression: \"{raw_expr}\"") print(f" m_isAssignment: {is_assn}") @@ -720,28 +751,61 @@ class DumpBreakItem(gdb.Command): def __init__(self): super(DumpBreakItem, self).__init__("dump-break-item", gdb.COMMAND_DATA) + self.dump_item = None def invoke(self, arg, from_tty): try: + # First, call DumpItem to print base class fields + if self.dump_item is None: + self.dump_item = DumpItem() + self.dump_item.invoke(arg, from_tty) + + # Then print derived class fields val = gdb.parse_and_eval(arg) ident = chr(int(val["m_identifier"])) - print(f"BreakItem @ {val.address}") print(f" m_identifier: {ident}") except Exception as e: print(f"Error: {e}") +class DumpContextItem(gdb.Command): + """Dump a ContextItem.""" + + def __init__(self): + super(DumpContextItem, self).__init__("dump-context-item", gdb.COMMAND_DATA) + self.dump_item = None + + def invoke(self, arg, from_tty): + try: + # First, call DumpItem to print base class fields + if self.dump_item is None: + self.dump_item = DumpItem() + self.dump_item.invoke(arg, from_tty) + + # Then print derived class fields + val = gdb.parse_and_eval(arg) + offset = int(val["m_offset"]) + print(f" m_offset: {offset}") + except Exception as e: + print(f"Error: {e}") + class DumpSelectItem(gdb.Command): """Dump a SelectItem.""" def __init__(self): super(DumpSelectItem, self).__init__("dump-select-item", gdb.COMMAND_DATA) + self.dump_item = None def invoke(self, arg, from_tty): try: + # First, call DumpItem to print base class fields + if self.dump_item is None: + self.dump_item = DumpItem() + self.dump_item.invoke(arg, from_tty) + + # Then print derived class fields val = gdb.parse_and_eval(arg) stream = int(val["m_stream"]) b_output = bool(val["bOutput"]) - print(f"SelectItem @ {val.address}") print(f" m_stream: {stream}") print(f" bOutput: {b_output}") except Exception as e: @@ -752,16 +816,22 @@ class 
DumpSplitItem(gdb.Command): def __init__(self): super(DumpSplitItem, self).__init__("dump-split-item", gdb.COMMAND_DATA) + self.dump_item = None def invoke(self, arg, from_tty): try: + # First, call DumpItem to print base class fields + if self.dump_item is None: + self.dump_item = DumpItem() + self.dump_item.invoke(arg, from_tty) + + # Then print derived class fields val = gdb.parse_and_eval(arg) is_field = bool(val["m_isField"]) sep = std_string_to_str(val["m_separator"]) splitting = bool(val["m_splitting"]) current_piece = int(val["m_currentPiece"]) split_type = "SPLITF" if is_field else "SPLITW" - print(f"SplitItem @ {val.address}") print(f" Type: {split_type}") print(f" m_separator: \"{sep}\"") print(f" m_splitting: {splitting}") @@ -1437,6 +1507,7 @@ def register_commands(): DumpSkipItem() DumpConditionItem() DumpBreakItem() + DumpContextItem() DumpSelectItem() DumpSplitItem() diff --git a/specs/src/processing/ProcessingState.cc b/specs/src/processing/ProcessingState.cc index 8940400..66f0a97 100644 --- a/specs/src/processing/ProcessingState.cc +++ b/specs/src/processing/ProcessingState.cc @@ -57,6 +57,7 @@ ProcessingState::ProcessingState() Reset(); m_ps = nullptr; m_prevPs = nullptr; + m_inputRecord = nullptr; m_inputStream = DEFAULT_READER_IDX; m_inputStreamChanged = false; m_bNoWrite = false; @@ -77,6 +78,7 @@ ProcessingState::ProcessingState(ProcessingState& ps) m_ExtraReads = 0; m_ps = nullptr; m_prevPs = nullptr; + m_inputRecord = nullptr; m_inputStation = STATION_FIRST; m_breakLevel = 0; m_inputStream = DEFAULT_READER_IDX; @@ -99,6 +101,7 @@ ProcessingState::ProcessingState(ProcessingState* pPS) m_ExtraReads = 0; m_ps = nullptr; m_prevPs = nullptr; + m_inputRecord = nullptr; m_inputStation = STATION_FIRST; m_breakLevel = 0; m_inputStream = DEFAULT_READER_IDX; @@ -118,12 +121,12 @@ ProcessingState::~ProcessingState() void ProcessingState::setString(PSpecString ps, bool bResetState) { - if (m_ps && ps!=m_ps) { - m_prevPs = m_ps; + if (m_inputRecord) { + 
m_prevPs = m_inputRecord; } else { - MYASSERT(m_prevPs==nullptr); m_prevPs = std::make_shared(); } + m_inputRecord = ps; m_ps = ps; m_wordCount = -1; m_fieldCount = -1; @@ -138,6 +141,13 @@ void ProcessingState::setStringInPlace(PSpecString ps) m_ps = ps; } +void ProcessingState::setContextString(PSpecString ps) +{ + m_ps = ps; + m_wordCount = -1; + m_fieldCount = -1; +} + void ProcessingState::setFirst() { if (m_inputStation != STATION_FIRST) { diff --git a/specs/src/processing/ProcessingState.h b/specs/src/processing/ProcessingState.h index 2940c83..d1ebd8d 100644 --- a/specs/src/processing/ProcessingState.h +++ b/specs/src/processing/ProcessingState.h @@ -87,6 +87,7 @@ class ProcessingState : public stateQueryAgent { void setFirst(); void setSecond(); void setStream(int i); + void setContextString(PSpecString ps); int getActiveInputStation() { return m_inputStation; } PSpecString currRecord() override { return (m_inputStation==STATION_FIRST) ? m_ps : m_prevPs; } bool recordNotAvailable() { return nullptr==currRecord(); } @@ -113,6 +114,7 @@ class ProcessingState : public stateQueryAgent { std::string m_fieldSeparator; PSpecString m_ps; // The current record PSpecString m_prevPs; // The previous record + PSpecString m_inputRecord; // The real input record (unaffected by CONTEXT) int m_wordCount; int m_fieldCount; unsigned int m_CycleCounter; diff --git a/specs/src/processing/Reader.cc b/specs/src/processing/Reader.cc index 161e80c..4688165 100644 --- a/specs/src/processing/Reader.cc +++ b/specs/src/processing/Reader.cc @@ -5,6 +5,7 @@ #include "Reader.h" uint64_t g_readRecordCounter = 0; +Reader* g_pReader = nullptr; void ReadAllRecordsIntoReaderQueue(Reader* r) { @@ -108,6 +109,12 @@ void Reader::Begin() { mp_thread = std::unique_ptr(new std::thread(ReadAllRecordsIntoReaderQueue, this)); } +PSpecString Reader::peek(int offset) +{ + MYTHROW("Rolling context is not supported for this reader type"); + return nullptr; +} + StandardReader::StandardReader() { 
m_NeedToClose = false; @@ -115,6 +122,10 @@ StandardReader::StandardReader() { m_buffer = nullptr; m_recfm = RECFM_DELIMITED; m_lineDelimiter = 0; + m_forwardContextSize = 0; + m_backwardContextSize = 0; + m_currentRecord = nullptr; + m_contextInitialized = false; } StandardReader::StandardReader(std::istream* f) { @@ -128,6 +139,10 @@ StandardReader::StandardReader(std::istream* f) { m_buffer = nullptr; m_recfm = RECFM_DELIMITED; m_lineDelimiter = 0; + m_forwardContextSize = 0; + m_backwardContextSize = 0; + m_currentRecord = nullptr; + m_contextInitialized = false; } StandardReader::StandardReader(std::string& fn) { @@ -142,6 +157,10 @@ StandardReader::StandardReader(std::string& fn) { m_buffer = nullptr; m_recfm = RECFM_DELIMITED; m_lineDelimiter = 0; + m_forwardContextSize = 0; + m_backwardContextSize = 0; + m_currentRecord = nullptr; + m_contextInitialized = false; } StandardReader::StandardReader(pipeType pipe) { @@ -151,6 +170,10 @@ StandardReader::StandardReader(pipeType pipe) { m_buffer = nullptr; m_recfm = RECFM_DELIMITED; m_lineDelimiter = 0; + m_forwardContextSize = 0; + m_backwardContextSize = 0; + m_currentRecord = nullptr; + m_contextInitialized = false; } StandardReader::~StandardReader() { @@ -181,11 +204,68 @@ void StandardReader::setLineDelimiter(char c) m_lineDelimiter = c; } +void StandardReader::setContextSizes(unsigned int forward, unsigned int backward) +{ + m_forwardContextSize = forward; + m_backwardContextSize = backward; +} + +PSpecString StandardReader::peek(int offset) +{ + if (offset == 0) { + return m_currentRecord ? 
m_currentRecord : std::make_shared(); + } + if (offset < 0) { + unsigned int idx = (unsigned int)(-offset) - 1; + if (idx >= m_backwardBuffer.size()) return std::make_shared(); + return m_backwardBuffer[m_backwardBuffer.size() - 1 - idx]; + } + // offset > 0 + unsigned int idx = (unsigned int)offset - 1; + if (idx >= m_forwardBuffer.size()) return std::make_shared(); + return m_forwardBuffer[idx]; +} + bool StandardReader::endOfSource() { - return m_bAbort || m_EOF; + if (m_bAbort) return true; + if (m_contextInitialized && !m_forwardBuffer.empty()) return false; + return m_EOF; } PSpecString StandardReader::getNextRecord() { + if (m_forwardContextSize == 0 && m_backwardContextSize == 0) + return getNextRecordInternal(); + + if (!m_contextInitialized) { + m_currentRecord = getNextRecordInternal(); + if (!m_currentRecord) return nullptr; + for (unsigned int i = 0; i < m_forwardContextSize; i++) { + PSpecString rec = getNextRecordInternal(); + if (!rec) break; + m_forwardBuffer.push_back(rec); + } + m_contextInitialized = true; + return m_currentRecord; + } + + // Shift window forward + m_backwardBuffer.push_back(m_currentRecord); + if (m_backwardBuffer.size() > m_backwardContextSize) + m_backwardBuffer.pop_front(); + + if (!m_forwardBuffer.empty()) { + m_currentRecord = m_forwardBuffer.front(); + m_forwardBuffer.pop_front(); + PSpecString rec = getNextRecordInternal(); + if (rec) m_forwardBuffer.push_back(rec); + } else { + m_currentRecord = getNextRecordInternal(); + } + + return m_currentRecord; +} + +PSpecString StandardReader::getNextRecordInternal() { std::string line; bool ok; switch (m_recfm) { @@ -310,6 +390,16 @@ void TestReader::InsertString(PSpecString ps) mp_arr[m_count++] = ps; } +PSpecString TestReader::peek(int offset) +{ + // m_idx points to the *next* record to read, so current record is m_idx-1 + int target = int(m_idx) - 1 + offset; + if (target < 0 || target >= int(m_count)) { + return std::make_shared(); // empty string for out-of-bounds + } + 
return mp_arr[target]; +} + // #include // for memset // #include "utils/ErrorReporting.h" diff --git a/specs/src/processing/Reader.h b/specs/src/processing/Reader.h index a718f7d..e398b12 100644 --- a/specs/src/processing/Reader.h +++ b/specs/src/processing/Reader.h @@ -1,6 +1,7 @@ #ifndef SPECS2016__PROCESSING__READER__H #define SPECS2016__PROCESSING__READER__H +#include #include #include #include "utils/StringQueue.h" @@ -39,6 +40,8 @@ class Reader { virtual void setLineDelimiter(char c) { MYTHROW("Reader::setLineDelimiter: should not be called"); } + virtual PSpecString peek(int offset); + virtual void setContextSizes(unsigned int forward, unsigned int backward) {} protected: StringQueue m_queue; std::unique_ptr mp_thread; @@ -61,6 +64,7 @@ class TestReader : public Reader { bool endOfSource() override {return m_bAbort || (m_idx >= m_count); } PSpecString getNextRecord() override {return mp_arr[m_idx++];} PSpecString get(classifyingTimer& tmr, unsigned int& _readerCounter) override {return getNextRecord();} + PSpecString peek(int offset) override; private: PSpecString *mp_arr; size_t m_count; @@ -86,7 +90,10 @@ class StandardReader : public Reader { PSpecString getNextRecord() override; void setFormatFixed(unsigned int lrecl, bool blocked) override; void setLineDelimiter(char c) override; + PSpecString peek(int offset) override; + void setContextSizes(unsigned int forward, unsigned int backward) override; private: + PSpecString getNextRecordInternal(); std::shared_ptr m_File; pipeType m_pipe; char* m_buffer; @@ -95,6 +102,13 @@ class StandardReader : public Reader { recordFormat m_recfm; unsigned int m_lrecl; char m_lineDelimiter; + // Rolling context buffers + unsigned int m_forwardContextSize; + unsigned int m_backwardContextSize; + std::deque m_forwardBuffer; + std::deque m_backwardBuffer; + PSpecString m_currentRecord; + bool m_contextInitialized; }; typedef std::shared_ptr PStandardReader; @@ -129,5 +143,6 @@ class multiReader : public Reader { typedef 
std::shared_ptr PMultiReader; +extern Reader* g_pReader; #endif diff --git a/specs/src/specitems/item.h b/specs/src/specitems/item.h index fcb77c7..e90282e 100644 --- a/specs/src/specitems/item.h +++ b/specs/src/specitems/item.h @@ -390,4 +390,17 @@ class SplitItem : public Item { typedef std::shared_ptr PSplitItem; +class ContextItem : public Item { +public: + explicit ContextItem(int offset); + ~ContextItem() override {} + std::string Debug() override; + ApplyRet apply(ProcessingState& pState, StringBuilder* pSB) override; + bool readsLines() override { return true; } +private: + int m_offset; +}; + +typedef std::shared_ptr PContextItem; + #endif diff --git a/specs/src/specitems/specItems.cc b/specs/src/specitems/specItems.cc index 9ad0ac0..6de5207 100644 --- a/specs/src/specitems/specItems.cc +++ b/specs/src/specitems/specItems.cc @@ -12,6 +12,9 @@ bool g_keep_suppressed_record = false; extern uint64_t g_readRecordCounter; unsigned int g_WhileGuardLimit = 5000; +unsigned int g_forwardContext = 0; +unsigned int g_backwardContext = 0; + struct predicateStackItem { PConditionItem pred; unsigned int argIndex; @@ -467,6 +470,23 @@ void itemGroup::Compile(std::vector &tokenVec, unsigned int& index) addItem(pItem); break; } + case TokenListType__CONTEXT: + { + if (tokenVec[index].Literal().empty()) { + std::string err = "CONTEXT at index " + std::to_string(tokenVec[index].argIndex()) + + " must be followed by an integer offset"; + MYTHROW(err); + } + int offset = std::stoi(tokenVec[index].Literal()); + auto pItem = std::make_shared(offset); + addItem(pItem); + if (offset > 0 && (unsigned int)offset > g_forwardContext) + g_forwardContext = (unsigned int)offset; + if (offset < 0 && (unsigned int)(-offset) > g_backwardContext) + g_backwardContext = (unsigned int)(-offset); + index++; + break; + } case TokenListType__REQUIRES: { if (!configSpecLiteralExists(tokenVec[index].Literal())) { @@ -1264,6 +1284,24 @@ ApplyRet SelectItem::apply(ProcessingState& pState, 
StringBuilder* pSB) return ApplyRet__Continue; } +ContextItem::ContextItem(int offset) : m_offset(offset) {} + +std::string ContextItem::Debug() +{ + std::string ret = "CONTEXT "; + if (m_offset >= 0) ret += "+"; + ret += std::to_string(m_offset); + return ret; +} + +ApplyRet ContextItem::apply(ProcessingState& pState, StringBuilder* pSB) +{ + MYASSERT_WITH_MSG(g_pReader != nullptr, "Rolling context requires a reader"); + PSpecString ps = g_pReader->peek(m_offset); + pState.setContextString(ps); + return ApplyRet__Continue; +} + #ifdef ndef std::ostream& operator<< (std::ostream& os, const SpecString &str) { diff --git a/specs/src/specitems/specItems.h b/specs/src/specitems/specItems.h index 683bb4b..89b016f 100644 --- a/specs/src/specitems/specItems.h +++ b/specs/src/specitems/specItems.h @@ -11,6 +11,9 @@ #define MAX_DEPTH_CONDITION_STATEMENTS 64 +extern unsigned int g_forwardContext; +extern unsigned int g_backwardContext; + class itemGroup { public: itemGroup(); diff --git a/specs/src/test/ProcessingTest.cc b/specs/src/test/ProcessingTest.cc index 57ac54e..613a063 100644 --- a/specs/src/test/ProcessingTest.cc +++ b/specs/src/test/ProcessingTest.cc @@ -8,6 +8,7 @@ #include "processing/Config.h" #include "processing/ProcessingState.h" #include "processing/StringBuilder.h" +#include "processing/Reader.h" extern ALUCounters g_counters; extern char g_printonly_rule; @@ -109,8 +110,11 @@ PSpecString runTestOnExample(const char* _specList, const char* _example) g_counters.clearAll(); g_keep_suppressed_record = false; g_printonly_rule = PRINTONLY_PRINTALL; + g_forwardContext = 0; + g_backwardContext = 0; TestReader tRead(100); + g_pReader = &tRead; unsigned int readerCounter = 1; char* example = strdup(_example); char* example_ctx = example; @@ -206,6 +210,7 @@ PSpecString runTestOnExample(const char* _specList, const char* _example) } end: + g_pReader = nullptr; free(example); while (!vec.empty()) { vec[0].deallocDynamic(); @@ -869,6 +874,60 @@ int main(int argc, 
char** argv) " PRINT 'exact(max(a))' NW"; VERIFY2(spec, "1.5\n2.5\n3.5", "0 0 0"); // TEST #228 + // === Rolling Context tests === + + // CONTEXT 0 resets to current record + spec = "CONTEXT 0 1-* 1"; + VERIFY2(spec, "alpha\nbeta\ngamma", "alpha\nbeta\ngamma"); // TEST #229 + + // CONTEXT +1 peeks at next record + spec = "1-* 1 CONTEXT 1 1-* NW"; + VERIFY2(spec, "alpha\nbeta\ngamma", "alpha beta\nbeta gamma\ngamma"); // TEST #230 + + // CONTEXT -1 peeks at previous record + spec = "1-* 1 CONTEXT -1 1-* NW"; + VERIFY2(spec, "alpha\nbeta\ngamma", "alpha\nbeta alpha\ngamma beta"); // TEST #231 + + // @+1 in expression peeks at next record + spec = "print @+1 1"; + VERIFY2(spec, "alpha\nbeta\ngamma", "beta\ngamma\n"); // TEST #232 + + // @-1 in expression peeks at previous record + spec = "print @-1 1"; + VERIFY2(spec, "alpha\nbeta\ngamma", "\nalpha\nbeta"); // TEST #233 + + // @+0 is the same as @@ + spec = "print @+0 1"; + VERIFY2(spec, "alpha\nbeta\ngamma", "alpha\nbeta\ngamma"); // TEST #234 + + // @-0 is the same as @@ + spec = "print @-0 1"; + VERIFY2(spec, "alpha\nbeta\ngamma", "alpha\nbeta\ngamma"); // TEST #235 + + // CONTEXT with larger forward offset + spec = "1-* 1 CONTEXT 2 1-* NW"; + VERIFY2(spec, "A\nB\nC\nD\nE", "A C\nB D\nC E\nD\nE"); // TEST #236 + + // CONTEXT with larger backward offset + spec = "1-* 1 CONTEXT -2 1-* NW"; + VERIFY2(spec, "A\nB\nC\nD\nE", "A\nB\nC A\nD B\nE C"); // TEST #237 + + // Combined forward and backward in one spec + spec = "CONTEXT -1 1-* 1 CONTEXT 0 1-* NW CONTEXT 1 1-* NW"; + VERIFY2(spec, "A\nB\nC", "A B\nA B C\nB C"); // TEST #238 + + // @+n in expression with function + spec = "PRINT 'length(@+1)' 1"; + VERIFY2(spec, "AB\nCDE\nF", "3\n1\n0"); // TEST #239 + + // Out-of-range forward context returns empty string + spec = "print @+5 1"; + VERIFY2(spec, "only", ""); // TEST #240 + + // Out-of-range backward context returns empty string + spec = "print @-5 1"; + VERIFY2(spec, "only", ""); // TEST #241 + if (errorCount) { 
std::cout << '\n' << errorCount << '/' << testCount << " tests failed.\n"; std::cout << "Failed tests: "; diff --git a/specs/src/test/specs.cc b/specs/src/test/specs.cc index b05d6e0..81fcfe4 100644 --- a/specs/src/test/specs.cc +++ b/specs/src/test/specs.cc @@ -270,6 +270,21 @@ int main (int argc, char** argv) exit (0); } + // Check for rolling context incompatibilities + if (g_forwardContext > 0 || g_backwardContext > 0) { + if (g_bThreaded) { + std::cerr << "Error: Rolling context (CONTEXT / @+n / @-n) is not supported with threading.\n"; + exit(0); + } + if (anyNonPrimaryInputStreamDefined()) { + std::cerr << "Error: Rolling context (CONTEXT / @+n / @-n) is not supported with multiple input streams.\n"; + exit(0); + } + if (g_bVerbose) { + std::cerr << "specs: Using a " << g_forwardContext + g_backwardContext + 1 << "-record rolling context: " << g_forwardContext << " records forward and " << g_backwardContext << " records backward.\n"; + } + } + // After the compilation, the token vector contents are no longer necessary for (size_t i=0; i 0 || g_backwardContext > 0) { + pRd->setContextSizes(g_forwardContext, g_backwardContext); + } + g_pReader = pRd.get(); + pRd->Begin(); timer.changeClass(timeClassProcessing); @@ -395,6 +415,7 @@ int main (int argc, char** argv) return -4; } + g_pReader = nullptr; pRd->End(); readLines = pRd->countRead(); usedLines = pRd->countUsed(); diff --git a/specs/src/utils/alu.cc b/specs/src/utils/alu.cc index 6f4325a..e79ae5e 100644 --- a/specs/src/utils/alu.cc +++ b/specs/src/utils/alu.cc @@ -16,8 +16,11 @@ #include "alu.h" #include "aluFunctions.h" #include "processing/Config.h" // for configured literals +#include "processing/Reader.h" // for g_pReader extern stateQueryAgent* g_pStateQueryAgent; +extern unsigned int g_forwardContext; +extern unsigned int g_backwardContext; void ALUValue::set(std::string& s) { @@ -861,12 +864,31 @@ PValue AluAssnOperator::computeAppnd(PValue operand, PValue prevOp) void 
AluInputRecord::_serialize(std::ostream& os) const { - os << "@@"; + if (m_offset == 0) { + os << "@@"; + } else if (m_offset > 0) { + os << "@+" << m_offset; + } else { + os << "@" << m_offset; + } +} + +std::string AluInputRecord::_identify() +{ + if (m_offset == 0) return "@@"; + if (m_offset > 0) return "@+" + std::to_string(m_offset); + return "@" + std::to_string(m_offset); } PValue AluInputRecord::evaluate() { - PSpecString ps = g_pStateQueryAgent->getFromTo(1,-1); + PSpecString ps; + if (m_offset == 0) { + ps = g_pStateQueryAgent->getFromTo(1,-1); + } else { + MYASSERT_WITH_MSG(g_pReader != nullptr, "Rolling context requires a reader"); + ps = g_pReader->peek(m_offset); + } PValue ret; if (ps) { ret = mkValue2(ps->data(), int(ps->length())); @@ -1233,6 +1255,28 @@ bool parseAluExpression(std::string& s, AluVec& vec) continue; } + // Rolling context: @+n or @-n + if (*c=='@' && (c[1]=='+' || c[1]=='-') && isDigit(c[2])) { + char sign = c[1]; + char* tokEnd = c + 2; + while (tokEnd(offset); + vec.push_back(pUnit); + prevUnitType = pUnit->type(); + c = tokEnd; + mayBeStart = false; + // Update rolling context size globals + if (offset > 0 && (unsigned int)offset > g_forwardContext) { + g_forwardContext = (unsigned int)offset; + } else if (offset < 0 && (unsigned int)(-offset) > g_backwardContext) { + g_backwardContext = (unsigned int)(-offset); + } + continue; + } + // Also a configured string if (*c=='@' && isFirstCharInIdentifier(c[1])) { char* tokEnd = ++c; diff --git a/specs/src/utils/alu.h b/specs/src/utils/alu.h index 0c21e25..3a180e2 100644 --- a/specs/src/utils/alu.h +++ b/specs/src/utils/alu.h @@ -285,13 +285,17 @@ class AluFunction : public AluUnit { class AluInputRecord : public AluUnit { public: - AluInputRecord() {} + AluInputRecord() : m_offset(0) {} + explicit AluInputRecord(int offset) : m_offset(offset) {} ~AluInputRecord() override {} void _serialize(std::ostream& os) const override; - std::string _identify() override {return "@@";} + 
std::string _identify() override; AluUnitType type() override {return UT_InputRecord;} PValue evaluate() override; bool requiresRead() override {return true;} + int offset() const {return m_offset;} +private: + int m_offset; }; class AluOtherToken : public AluUnit { diff --git a/specs/tests/pytest.py b/specs/tests/pytest.py index c313673..39b02a2 100644 --- a/specs/tests/pytest.py +++ b/specs/tests/pytest.py @@ -83,7 +83,7 @@ def set_localfuncs(lf): # But if we force it... sys.stdout.write("Test 04 (bad file; non-python function; force) -- ") ret = run_cmd('print "sqrt(81)" 1', True) -if ret=="Python Interface: Error loading local functions" or ret=="SyntaxError: invalid syntax": +if ret=="Python Interface: Error loading local functions" or ret.startswith("SyntaxError: invalid syntax"): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") @@ -92,7 +92,7 @@ def set_localfuncs(lf): # Or call a non-built-in function... sys.stdout.write("Test 05 (bad file; unknown function) -- ") ret = run_cmd('print "kuku(16)" 1') -if ret=="Python Interface: Error loading local functions" or ret=="SyntaxError: invalid syntax": +if ret=="Python Interface: Error loading local functions" or ret.startswith("SyntaxError: invalid syntax"): sys.stdout.write("OK\n") else: sys.stdout.write("Not OK: <"+ret+">\n") diff --git a/specs/tests/valgrind_unit_tests.py b/specs/tests/valgrind_unit_tests.py index 20dbd9a..4ae8837 100644 --- a/specs/tests/valgrind_unit_tests.py +++ b/specs/tests/valgrind_unit_tests.py @@ -1,7 +1,7 @@ import sys, memcheck, argparse count_ALU_tests = 825 -count_processing_tests = 228 +count_processing_tests = 241 count_token_tests = 17 # Parse the one command line options From 9b3b189f5cca2d0c5b01115e21903d8feba60f47 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Tue, 19 May 2026 16:04:46 +0300 Subject: [PATCH 16/50] Issue #382 - Make @@ work correctly when CONTEXT is used (#383) --- specs/src/processing/ProcessingState.h | 1 + 
specs/src/test/ProcessingTest.cc | 4 ++++ specs/src/utils/alu.cc | 2 +- specs/src/utils/aluFunctions.h | 1 + specs/tests/valgrind_unit_tests.py | 2 +- 5 files changed, 8 insertions(+), 2 deletions(-) diff --git a/specs/src/processing/ProcessingState.h b/specs/src/processing/ProcessingState.h index d1ebd8d..21fbc72 100644 --- a/specs/src/processing/ProcessingState.h +++ b/specs/src/processing/ProcessingState.h @@ -90,6 +90,7 @@ class ProcessingState : public stateQueryAgent { void setContextString(PSpecString ps); int getActiveInputStation() { return m_inputStation; } PSpecString currRecord() override { return (m_inputStation==STATION_FIRST) ? m_ps : m_prevPs; } + PSpecString inputRecord() override { return m_inputRecord; } bool recordNotAvailable() { return nullptr==currRecord(); } bool inputStreamHasChanged() { return m_inputStreamChanged; } void resetInputStreamFlag() { m_inputStreamChanged = false; } diff --git a/specs/src/test/ProcessingTest.cc b/specs/src/test/ProcessingTest.cc index 613a063..c61c1f0 100644 --- a/specs/src/test/ProcessingTest.cc +++ b/specs/src/test/ProcessingTest.cc @@ -928,6 +928,10 @@ int main(int argc, char** argv) spec = "print @-5 1"; VERIFY2(spec, "only", ""); // TEST #241 + // @@ returns the real input record, not the CONTEXT-modified one + spec = "CONTEXT -1 PRINT '@@' 1 WRITE PRINT '@-1' 1 WRITE"; + VERIFY2(spec, "alpha\nbeta\ngamma", "alpha\n\nbeta\nalpha\ngamma\nbeta"); // TEST #242 + if (errorCount) { std::cout << '\n' << errorCount << '/' << testCount << " tests failed.\n"; std::cout << "Failed tests: "; diff --git a/specs/src/utils/alu.cc b/specs/src/utils/alu.cc index e79ae5e..55b5596 100644 --- a/specs/src/utils/alu.cc +++ b/specs/src/utils/alu.cc @@ -884,7 +884,7 @@ PValue AluInputRecord::evaluate() { PSpecString ps; if (m_offset == 0) { - ps = g_pStateQueryAgent->getFromTo(1,-1); + ps = g_pStateQueryAgent->inputRecord(); } else { MYASSERT_WITH_MSG(g_pReader != nullptr, "Rolling context requires a reader"); ps = 
g_pReader->peek(m_offset); diff --git a/specs/src/utils/aluFunctions.h b/specs/src/utils/aluFunctions.h index ba4ab00..d91dd4d 100644 --- a/specs/src/utils/aluFunctions.h +++ b/specs/src/utils/aluFunctions.h @@ -427,6 +427,7 @@ class stateQueryAgent { return getFromTo(int(from), int(to)); } virtual PSpecString currRecord() = 0; + virtual PSpecString inputRecord() = 0; virtual bool isRunIn() = 0; virtual bool isRunOut() = 0; virtual ALUInt getRecordCount() = 0; diff --git a/specs/tests/valgrind_unit_tests.py b/specs/tests/valgrind_unit_tests.py index 4ae8837..466765a 100644 --- a/specs/tests/valgrind_unit_tests.py +++ b/specs/tests/valgrind_unit_tests.py @@ -1,7 +1,7 @@ import sys, memcheck, argparse count_ALU_tests = 825 -count_processing_tests = 241 +count_processing_tests = 242 count_token_tests = 17 # Parse the one command line options From 25b555bc6f88faec6af289758bdaa225f98bde4d Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Tue, 19 May 2026 19:23:53 +0300 Subject: [PATCH 17/50] Issue #384 - Add the ctxrecno function (#385) --- manpage | 9 +++++++++ specs/docs/alu_adv.md | 1 + specs/src/processing/ProcessingState.cc | 7 ++++++- specs/src/processing/ProcessingState.h | 4 +++- specs/src/specitems/specItems.cc | 2 +- specs/src/test/ProcessingTest.cc | 22 ++++++++++++++++++++++ specs/src/utils/aluFunctions.cc | 5 +++++ specs/src/utils/aluFunctions.h | 3 +++ specs/tests/valgrind_unit_tests.py | 4 ++-- 9 files changed, 52 insertions(+), 5 deletions(-) diff --git a/manpage b/manpage index e1fea6e..65be818 100644 --- a/manpage +++ b/manpage @@ -912,6 +912,15 @@ or spec units, then this number will be equal to the result of .B number(). Otherwise, it will be higher. +.IP "ctxrecno()" 3 +Returns the record number of the record that input parts work on. This is similar to +.B recno(), +but considers rolling context, which +.B recno() +does not. When no +.I CONTEXT +is active, returns the same value as +.B recno(). .IP "record()" 3 Returns the entire input record. 
Equivalent to .B range(1,-1) diff --git a/specs/docs/alu_adv.md b/specs/docs/alu_adv.md index b2dba3c..e81562e 100644 --- a/specs/docs/alu_adv.md +++ b/specs/docs/alu_adv.md @@ -164,6 +164,7 @@ All three regular expression functions have an argument called `matchFlags`. Thi | `number()` | Returns the number of processing cycles we have already gone through. Unless `READ` or `READSTOP` are used, this will be equal to the number of records read so far. | | `range(n,m)` | Returns the substring from the *n*-th character (default first) to the *m*-th character (default last) | | `recno()` | Returns the number of the currently read record. If the `READ` or `READSTOP` keywords are used this may be greater than `number()` | +| `ctxrecno()` | Returns the record number of the record that input parts work on. This is similar to `recno()`, but considers rolling context, which `recno()` does not. | | `record()` | Returns the entire input record | | `word(n)` | Returns the *n*-th word | | `wordrange(n,m)` | Returns the substring from the *n*-th word (default first) to the *m*-th word (default last) | diff --git a/specs/src/processing/ProcessingState.cc b/specs/src/processing/ProcessingState.cc index 66f0a97..61c4e5d 100644 --- a/specs/src/processing/ProcessingState.cc +++ b/specs/src/processing/ProcessingState.cc @@ -49,6 +49,7 @@ void ProcessingState::Reset() m_wordCount = -1; m_CycleCounter = 0; m_ExtraReads = 0; + m_contextOffset = 0; m_inputStation = STATION_FIRST; m_breakLevel = 0; } @@ -76,6 +77,7 @@ ProcessingState::ProcessingState(ProcessingState& ps) m_wordCount = 0; m_CycleCounter = 0; m_ExtraReads = 0; + m_contextOffset = 0; m_ps = nullptr; m_prevPs = nullptr; m_inputRecord = nullptr; @@ -99,6 +101,7 @@ ProcessingState::ProcessingState(ProcessingState* pPS) m_wordCount = 0; m_CycleCounter = 0; m_ExtraReads = 0; + m_contextOffset = 0; m_ps = nullptr; m_prevPs = nullptr; m_inputRecord = nullptr; @@ -130,6 +133,7 @@ void ProcessingState::setString(PSpecString ps, bool 
bResetState) m_ps = ps; m_wordCount = -1; m_fieldCount = -1; + m_contextOffset = 0; if (bResetState) { fieldIdentifierClear(); resetBreaks(); @@ -141,11 +145,12 @@ void ProcessingState::setStringInPlace(PSpecString ps) m_ps = ps; } -void ProcessingState::setContextString(PSpecString ps) +void ProcessingState::setContextString(PSpecString ps, int offset) { m_ps = ps; m_wordCount = -1; m_fieldCount = -1; + m_contextOffset = offset; } void ProcessingState::setFirst() diff --git a/specs/src/processing/ProcessingState.h b/specs/src/processing/ProcessingState.h index 21fbc72..b25b48b 100644 --- a/specs/src/processing/ProcessingState.h +++ b/specs/src/processing/ProcessingState.h @@ -52,6 +52,7 @@ class ProcessingState : public stateQueryAgent { bool isRunIn() override { return (m_CycleCounter==1); } bool isRunOut() override { return (m_ps==nullptr); } // NOTE: will return true before first record ALUInt getRecordCount() override { return ALUInt(m_CycleCounter + m_ExtraReads); } + ALUInt getContextOffset() override { return ALUInt(m_contextOffset); } ALUInt getIterationCount() override { return ALUInt(m_CycleCounter); } bool breakEstablished(char id) override; PAluValueStats valueStatistics(char id) override; @@ -87,7 +88,7 @@ class ProcessingState : public stateQueryAgent { void setFirst(); void setSecond(); void setStream(int i); - void setContextString(PSpecString ps); + void setContextString(PSpecString ps, int offset = 0); int getActiveInputStation() { return m_inputStation; } PSpecString currRecord() override { return (m_inputStation==STATION_FIRST) ? 
m_ps : m_prevPs; } PSpecString inputRecord() override { return m_inputRecord; } @@ -120,6 +121,7 @@ class ProcessingState : public stateQueryAgent { int m_fieldCount; unsigned int m_CycleCounter; unsigned int m_ExtraReads; + int m_contextOffset; std::vector m_wordStart; std::vector m_wordEnd; std::vector m_fieldStart; diff --git a/specs/src/specitems/specItems.cc b/specs/src/specitems/specItems.cc index 6de5207..ae74bd8 100644 --- a/specs/src/specitems/specItems.cc +++ b/specs/src/specitems/specItems.cc @@ -1298,7 +1298,7 @@ ApplyRet ContextItem::apply(ProcessingState& pState, StringBuilder* pSB) { MYASSERT_WITH_MSG(g_pReader != nullptr, "Rolling context requires a reader"); PSpecString ps = g_pReader->peek(m_offset); - pState.setContextString(ps); + pState.setContextString(ps, m_offset); return ApplyRet__Continue; } diff --git a/specs/src/test/ProcessingTest.cc b/specs/src/test/ProcessingTest.cc index c61c1f0..cde37e8 100644 --- a/specs/src/test/ProcessingTest.cc +++ b/specs/src/test/ProcessingTest.cc @@ -932,6 +932,28 @@ int main(int argc, char** argv) spec = "CONTEXT -1 PRINT '@@' 1 WRITE PRINT '@-1' 1 WRITE"; VERIFY2(spec, "alpha\nbeta\ngamma", "alpha\n\nbeta\nalpha\ngamma\nbeta"); // TEST #242 + // === ctxrecno tests === + + // ctxrecno without CONTEXT returns same as recno + spec = "PRINT 'ctxrecno()' 1"; + VERIFY2(spec, "a\nb\nc", "1\n2\n3"); // TEST #243 + + // ctxrecno with CONTEXT 1 returns recno + 1 + spec = "CONTEXT 1 PRINT 'ctxrecno()' 1"; + VERIFY2(spec, "a\nb\nc", "2\n3\n4"); // TEST #244 + + // ctxrecno with CONTEXT -1 returns recno - 1 + spec = "CONTEXT -1 PRINT 'ctxrecno()' 1"; + VERIFY2(spec, "a\nb\nc", "0\n1\n2"); // TEST #245 + + // ctxrecno with CONTEXT 0 returns same as recno + spec = "CONTEXT 0 PRINT 'ctxrecno()' 1"; + VERIFY2(spec, "a\nb\nc", "1\n2\n3"); // TEST #246 + + // ctxrecno resets after CONTEXT changes + spec = "PRINT 'ctxrecno()' 1 CONTEXT 1 PRINT 'ctxrecno()' NW"; + VERIFY2(spec, "a\nb\nc", "1 2\n2 3\n3 4"); // TEST #247 + if 
(errorCount) { std::cout << '\n' << errorCount << '/' << testCount << " tests failed.\n"; std::cout << "Failed tests: "; diff --git a/specs/src/utils/aluFunctions.cc b/specs/src/utils/aluFunctions.cc index e649754..59c5fc2 100644 --- a/specs/src/utils/aluFunctions.cc +++ b/specs/src/utils/aluFunctions.cc @@ -414,6 +414,11 @@ PValue AluFunc_recno() return mkValue(g_pStateQueryAgent->getRecordCount()); } +PValue AluFunc_ctxrecno() +{ + return mkValue(g_pStateQueryAgent->getRecordCount() + g_pStateQueryAgent->getContextOffset()); +} + PValue AluFunc_eof() { bool isRunOut = g_pStateQueryAgent->isRunOut(); diff --git a/specs/src/utils/aluFunctions.h b/specs/src/utils/aluFunctions.h index d91dd4d..4021fc2 100644 --- a/specs/src/utils/aluFunctions.h +++ b/specs/src/utils/aluFunctions.h @@ -40,6 +40,8 @@ "() - Returns TRUE (1) if this is the first line.","") \ X(recno, 0, ALUFUNC_REGULAR, true, \ "() - Returns the record number of the current record.","Increments with every READ or READSTOP.") \ + X(ctxrecno, 0, ALUFUNC_REGULAR, true, \ + "() - Returns the record number of the record that input parts work on.","This is similar to recno, but considers rolling context, which recno does not.") \ X(number, 0, ALUFUNC_REGULAR, true, \ "() - Returns the number of times this specification has restarted","Does not increment with READ or READSTOP. 
Otherwise similar to recno().") \ X(eof, 0, ALUFUNC_REGULAR, false, \ @@ -431,6 +433,7 @@ class stateQueryAgent { virtual bool isRunIn() = 0; virtual bool isRunOut() = 0; virtual ALUInt getRecordCount() = 0; + virtual ALUInt getContextOffset() = 0; virtual ALUInt getIterationCount() = 0; virtual bool breakEstablished(char id) = 0; virtual PAluValueStats valueStatistics(char id) = 0; diff --git a/specs/tests/valgrind_unit_tests.py b/specs/tests/valgrind_unit_tests.py index 466765a..c5329e2 100644 --- a/specs/tests/valgrind_unit_tests.py +++ b/specs/tests/valgrind_unit_tests.py @@ -1,7 +1,7 @@ import sys, memcheck, argparse -count_ALU_tests = 825 -count_processing_tests = 242 +count_ALU_tests = 832 +count_processing_tests = 247 count_token_tests = 17 # Parse the one command line options From d9ddcad0a74580e737fd1fce0d45890d06326a79 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Wed, 20 May 2026 17:00:57 +0300 Subject: [PATCH 18/50] Credit where credit is due (#386) --- README.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index fb65923..921dd85 100644 --- a/README.md +++ b/README.md @@ -78,8 +78,10 @@ When starting a new version: Contributors ============ -* Yoav Nir ([yoavnir](https://github.com/yoavnir)) -* Jean-Baptiste Jouband ([Gawesomer](https://github.com/Gawesomer)) +- Yoav Nir ([yoavnir](https://github.com/yoavnir)) +- Jean-Baptiste Jouband ([Gawesomer](https://github.com/Gawesomer)) +- donglrd ([donglrd](https://github.com/donglrd)) +- Miriam-R-coder ([Miriam-R-coder](https://github.com/Miriam)) Documentation ============= From a97869c4021085b1724306304c65ed94c18927c3 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Wed, 20 May 2026 17:00:57 +0300 Subject: [PATCH 19/50] Credit where credit is due (#386) --- README.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index ed28841..91fa2b1 100644 --- a/README.md +++ b/README.md @@ -79,8 +79,10 @@ When starting a new 
version: Contributors ============ -* Yoav Nir ([yoavnir](https://github.com/yoavnir)) -* Jean-Baptiste Jouband ([Gawesomer](https://github.com/Gawesomer)) +- Yoav Nir ([yoavnir](https://github.com/yoavnir)) +- Jean-Baptiste Jouband ([Gawesomer](https://github.com/Gawesomer)) +- donglrd ([donglrd](https://github.com/donglrd)) +- Miriam-R-coder ([Miriam-R-coder](https://github.com/Miriam)) Documentation ============= From 01e17e584a45fbd9db556ba40f0bc035225071dd Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Sun, 24 May 2026 17:29:45 +0300 Subject: [PATCH 20/50] Improve debugging of a python function rec --- specs/src/gdb/specs_gdb.py | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/specs/src/gdb/specs_gdb.py b/specs/src/gdb/specs_gdb.py index 9f0799c..a6a0591 100644 --- a/specs/src/gdb/specs_gdb.py +++ b/specs/src/gdb/specs_gdb.py @@ -1338,15 +1338,15 @@ def invoke(self, arg, from_tty): # Access members try: m_name = std_string_to_str(val["m_name"]) - print(f" m_name: {m_name}") + print(f" Name: {m_name}") except Exception as e: - print(f" m_name: (error: {e})") + print(f" Name: (error: {e})") try: m_pFuncPtr = val["m_pFuncPtr"] - print(f" m_pFuncPtr: {m_pFuncPtr}") + print(f" Func Ptr: {m_pFuncPtr}") except Exception as e: - print(f" m_pFuncPtr: (error: {e})") + print(f" Func Ptr: (error: {e})") # Show m_doc try: @@ -1354,22 +1354,22 @@ def invoke(self, arg, from_tty): if m_doc: # Format multi-line docs nicely if "\n" in m_doc: - print(f" m_doc:") + print(f" doc:") for line in m_doc.split("\n"): print(f" {line}") else: - print(f" m_doc: {m_doc}") + print(f" doc: {m_doc}") else: - print(f" m_doc: (empty)") + print(f" doc: (empty)") except Exception as e: - print(f" m_doc: (error: {e})") + print(f" doc: (error: {e})") # Show m_argTypeExact try: m_argTypeExact = bool(val["m_argTypeExact"]) - print(f" m_argTypeExact: {m_argTypeExact}") + print(" Arg Type: {}".format("exact" if m_argTypeExact else "no exactness information")) 
except Exception as e: - print(f" m_argTypeExact: (error: {e})") + print(f" Arg Type: (error: {e})") # Show m_pTuple try: @@ -1385,7 +1385,7 @@ def invoke(self, arg, from_tty): try: m_args = val["m_args"] arg_size = std_vector_size(m_args) - print(f" m_args ({arg_size} items):") + print(f" Args ({arg_size} items):") # Try to iterate and dump each argument for i in range(arg_size): From 1c6ee1e4a211c49c3f130b1bdea20f50d9624551 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Mon, 25 May 2026 17:23:19 +0300 Subject: [PATCH 21/50] Issue #388 - Print Python functions in dump of collection (#390) --- specs/docs/debugging.md | 28 +- specs/src/gdb/COMMANDS.md | 19 +- specs/src/gdb/specs.gdb | 20 +- specs/src/gdb/specs_gdb.py | 867 +++++++++++++++++++++++++++++++------ 4 files changed, 786 insertions(+), 148 deletions(-) diff --git a/specs/docs/debugging.md b/specs/docs/debugging.md index e2dcb1b..9f923a6 100644 --- a/specs/docs/debugging.md +++ b/specs/docs/debugging.md @@ -294,25 +294,29 @@ Breakpoint 1 at 0x... ... Breakpoint 1, PyObject_CallObject (...) at PythonIntf.cc:167 -(gdb) dump_python_func_collection g_PythonFunctions +(gdb) dump_python_func_collection gFunctionCollection PythonFunctionCollection @ 0x... - m_Initialized: true - m_Functions @ 0x... + Initialized: true + Functions (3 entries): + my_custom_function @ 0x... (2 args) + another_func @ 0x... (exact, 1 args) + third_func @ 0x... (0 args) -(gdb) dump_python_func_rec g_PythonFunctions.m_Functions[0] +(gdb) dump_python_func_by_name gFunctionCollection my_custom_function PythonFuncRec @ 0x... - m_name: my_custom_function - m_pFuncPtr: 0x... - m_doc: Computes the custom value based on input - m_pTuple: 0x... - m_args (2 items): - [0] input_value (default: counterType__Int) + Name: my_custom_function + Func Ptr: 0x... + doc: Computes the custom value based on input + Arg Type: no exactness information + Tuple: 0x... 
+ Args (2 items): + [0] input_value (default: Int) = 0 - [1] multiplier (default: counterType__Float) + [1] multiplier (default: Float) = 1.5 ``` -This shows you the complete function signature, documentation, and argument defaults. The `m_pTuple` field shows whether arguments have been prepared for the function call. +This shows you the complete function signature, documentation, and argument defaults. The `Tuple` field shows whether arguments have been prepared for the function call. The collection dump now lists all functions with their key properties (exactness and argument count). --- diff --git a/specs/src/gdb/COMMANDS.md b/specs/src/gdb/COMMANDS.md index 2c85084..00bb809 100644 --- a/specs/src/gdb/COMMANDS.md +++ b/specs/src/gdb/COMMANDS.md @@ -68,7 +68,8 @@ || `dump-alu-function` | `dump_alu_function` | Dump an AluFunction (name, arg count, input dependency) | || `dump-external-function-rec` | `dump_external_func_rec` | Dump an ExternalFunctionRec (calls virtual methods GetArgCount/GetFuncPtr) | || `dump-external-function-collection` | `dump_external_func_collection` | Dump an ExternalFunctionCollection (initialization state) | -|| `dump-python-function-collection` | `dump_python_func_collection` | Dump a PythonFunctionCollection (registry state and function count) | +|| `dump-python-function-collection` | `dump_python_func_collection` | Dump a PythonFunctionCollection (registry state, function count, and function list) | +|| `dump-python-func-by-name` | `dump_python_func_by_name` | Dump a PythonFuncRec by looking it up in a collection by name | || `dump-python-func-rec` | `dump_python_func_rec` | Dump a PythonFuncRec (name, pointer, doc, and expanded argument list) | || `dump-python-func-arg` | `dump_python_func_arg` | Dump a PythonFuncArg (name, default type, and default value) | @@ -83,7 +84,7 @@ || Command | Description | ||---------|-------------| -|| `bp_apply` | Set breakpoint on Item::apply | +|| `bp_apply` | Set breakpoints on all 9 Item subclass 
apply methods | || `bp_getstr` | Set breakpoint on InputPart::getStr | || `bp_compile` | Set breakpoint on itemGroup::Compile | || `bp_parseAluExpression` | Set breakpoint on parseAluExpression, where expressions are parsed | @@ -126,12 +127,20 @@ ALUValue @ 0x... m_exact: true ``` -### Set Breakpoint on Item::apply +### Set Breakpoints on Item::apply ```gdb (gdb) bp_apply -Breakpoint 1 at 0x... +Breakpoint 1 at 0x...: DataField::apply +Breakpoint 2 at 0x...: TokenItem::apply +Breakpoint 3 at 0x...: SetItem::apply +Breakpoint 4 at 0x...: SkipItem::apply +Breakpoint 5 at 0x...: ConditionItem::apply +Breakpoint 6 at 0x...: BreakItem::apply +Breakpoint 7 at 0x...: SelectItem::apply +Breakpoint 8 at 0x...: SplitItem::apply +Breakpoint 9 at 0x...: ContextItem::apply (gdb) run -Breakpoint 1, Item::apply (this=0x..., pState=0x..., pSB=0x...) at specitems/specItems.cc:... +Breakpoint 1, DataField::apply (this=0x..., pState=..., pSB=0x...) at specitems/dataField.cc:... (gdb) dump_pstate pState ``` diff --git a/specs/src/gdb/specs.gdb b/specs/src/gdb/specs.gdb index 630ad02..954ba38 100644 --- a/specs/src/gdb/specs.gdb +++ b/specs/src/gdb/specs.gdb @@ -173,6 +173,10 @@ define dump_python_func_collection dump-python-function-collection $arg0 end +define dump_python_func_by_name + dump-python-func-by-name $arg0 $arg1 +end + define dump_python_func_rec dump-python-func-rec $arg0 end @@ -191,10 +195,18 @@ end # ============================================================================ define bp_apply - break Item::apply + break DataField::apply + break TokenItem::apply + break SetItem::apply + break SkipItem::apply + break ConditionItem::apply + break BreakItem::apply + break SelectItem::apply + break SplitItem::apply + break ContextItem::apply end document bp_apply - Set a breakpoint on Item::apply to debug item application. + Set breakpoints on every Item subclass's apply method (9 breakpoints). 
end define bp_getstr @@ -276,12 +288,14 @@ echo dump_alu_counters - Dump ALUCounters\n echo dump_alu_vec - Dump AluVec\n echo dump_alu_function - Dump AluFunction\n echo dump_external_func_rec - Dump ExternalFunctionRec\n +echo dump_python_func_collection - Dump PythonFunctionCollection\n +echo dump_python_func_by_name - Dump PythonFuncRec by name\n echo dump_python_func_rec - Dump PythonFuncRec\n echo dump_python_func_arg - Dump PythonFuncArg\n echo dump_exception - Dump SpecsException\n echo \n echo Breakpoint helpers:\n -echo bp_apply - Break on Item::apply\n +echo bp_apply - Break on all 9 Item subclass apply methods\n echo bp_getstr - Break on InputPart::getStr\n echo bp_compile - Break on itemGroup::Compile\n echo bp_parseAluExpression - Break on parseAluExpression, where expressions are parsed\n diff --git a/specs/src/gdb/specs_gdb.py b/specs/src/gdb/specs_gdb.py index a6a0591..1da5c5d 100644 --- a/specs/src/gdb/specs_gdb.py +++ b/specs/src/gdb/specs_gdb.py @@ -158,6 +158,18 @@ 3: "NullStr", } +EXTREME_BOOL = { + 0: "False", + 1: "True", + 2: "DontCare", +} + +INPUT_STATION = { + -1: "FIRST", + -2: "SECOND", + 0: "STDERR", +} + # Token types (X-macro generated, simplified list) TOKEN_TYPES = { 0: "STOP", @@ -285,6 +297,65 @@ def std_vector_size(val): except: return 0 +def std_map_size(val): + """Get the size of a std::map.""" + try: + return int(val["_M_t"]["_M_impl"]["_M_node_count"]) + except: + return 0 + +def std_map_items(val): + """ + Iterate over key-value pairs in a std::map. + Uses GDB's default visualizer (libstdc++ StdMapPrinter). + Returns list of (key, value) tuples. + + Note: The libstdc++ StdMapPrinter yields *alternating* children: + [0] -> first key, [1] -> first value, + [2] -> second key, [3] -> second value, ... + This function pairs them up into (key, value) tuples. 
+ """ + try: + items = [] + pp = gdb.default_visualizer(val) + if pp and hasattr(pp, 'children'): + children = list(pp.children()) + # Pair up alternating key/value entries + for i in range(0, len(children) - 1, 2): + key = children[i][1] # the gdb.Value for the key + value = children[i + 1][1] # the gdb.Value for the value + items.append((key, value)) + return items + except: + return [] + +def std_vector_int_items(val): + """ + Extract all elements from a std::vector as a Python list. + Returns an empty list on failure. + """ + try: + size = std_vector_size(val) + start = val["_M_impl"]["_M_start"] + return [int(start[i]) for i in range(size)] + except: + return [] + +def std_stack_items(val): + """ + Extract all elements from a std::stack as a Python list (bottom to top). + std::stack wraps a deque in its 'c' member. Uses GDB's pretty-printer + for the underlying deque to iterate. + """ + try: + deque = val["c"] + pp = gdb.default_visualizer(deque) + if pp and hasattr(pp, 'children'): + return [child[1] for child in pp.children()] + return [] + except: + return [] + def identify_dynamic_type(val): """ Identify the actual derived type of a polymorphic object by reading the vtable. @@ -313,6 +384,137 @@ def call_method_safe(val, method_name, *args): except: return None +def pyobj_repr(pyobj_ptr): + """ + Given a GDB value that is a PyObject*, return a human-readable string + representation of the Python object. Handles int, float, str, bool, + None, and tuple. Falls back to showing the type name for anything else. 
+ """ + try: + if int(pyobj_ptr) == 0: + return "NULL" + + # Read ob_type->tp_name to discover the Python type + ob_type = pyobj_ptr["ob_type"] + tp_name = ob_type["tp_name"].string() + + if tp_name == "NoneType": + return "None" + + if tp_name == "bool": + # True/False are singletons; compare the pointer value to _Py_TrueStruct + try: + py_true = gdb.parse_and_eval("(PyObject*)&_Py_TrueStruct") + if int(pyobj_ptr) == int(py_true): + return "True" + return "False" + except: + return "" + + if tp_name == "float": + try: + float_type = gdb.lookup_type("PyFloatObject").pointer() + fval = pyobj_ptr.cast(float_type)["ob_fval"] + return str(float(fval)) + except: + return "" + + if tp_name == "int": + try: + long_type = gdb.lookup_type("PyLongObject").pointer() + long_obj = pyobj_ptr.cast(long_type) + lv_tag = int(long_obj["long_value"]["lv_tag"]) + # _PyLong_NON_SIZE_BITS = 3, _PyLong_SIGN_MASK = 3 + is_compact = lv_tag < (2 << 3) + if is_compact: + sign = 1 - (lv_tag & 3) + digit = int(long_obj["long_value"]["ob_digit"][0]) + return str(sign * digit) + else: + return "" + except: + return "" + + if tp_name == "str": + try: + ascii_type = gdb.lookup_type("PyASCIIObject").pointer() + ascii_obj = pyobj_ptr.cast(ascii_type) + length = int(ascii_obj["length"]) + kind = int(ascii_obj["state"]["kind"]) + is_ascii = int(ascii_obj["state"]["ascii"]) + is_compact = int(ascii_obj["state"]["compact"]) + if is_ascii and is_compact and kind == 1: + # Data follows immediately after the PyASCIIObject struct + data_ptr = (ascii_obj + 1).cast(gdb.lookup_type("char").pointer()) + s = data_ptr.string(length=min(length, 80)) + if length > 80: + return f'"{s}..."' + return f'"{s}"' + elif is_compact and kind == 1: + # Latin-1, data after PyCompactUnicodeObject + compact_type = gdb.lookup_type("PyCompactUnicodeObject").pointer() + compact_obj = pyobj_ptr.cast(compact_type) + data_ptr = (compact_obj + 1).cast(gdb.lookup_type("char").pointer()) + s = data_ptr.string(length=min(length, 80)) + 
if length > 80: + return f'"{s}..."' + return f'"{s}"' + else: + return f'' + except Exception as e: + return f"" + + if tp_name == "tuple": + try: + tuple_type = gdb.lookup_type("PyTupleObject").pointer() + tup = pyobj_ptr.cast(tuple_type) + # ob_size is in the PyVarObject header + var_type = gdb.lookup_type("PyVarObject").pointer() + size = int(pyobj_ptr.cast(var_type)["ob_size"]) + elems = [] + for i in range(min(size, 10)): + elem = tup["ob_item"][i] + elems.append(pyobj_repr(elem)) + inner = ", ".join(elems) + if size > 10: + inner += ", ..." + return f"({inner})" + except Exception as e: + return f"" + + # Fallback: just show the type name + return f"<{tp_name} object at {pyobj_ptr}>" + except: + return f"" + +def dump_pytuple(pyobj_ptr, arg_dict, indent=" "): + """ + Given a GDB value that is a PyObject* pointing to a Python tuple, print + its contents. The tuple holds the argument values for a Python function + call (populated by setArgValue, freed by ResetArgs). + """ + if not isinstance(arg_dict,dict): + arg_dict = dict() + try: + if int(pyobj_ptr) == 0: + print(f"{indent}Tuple: nullptr (no call arguments prepared)") + return + + var_type = gdb.lookup_type("PyVarObject").pointer() + size = int(pyobj_ptr.cast(var_type)["ob_size"]) + tuple_type = gdb.lookup_type("PyTupleObject").pointer() + tup = pyobj_ptr.cast(tuple_type) + + print(f"{indent}Tuple ({size} call args):") + for i in range(size): + elem = tup["ob_item"][i] + if i in arg_dict.keys(): + print(f"{indent} [{i}] {arg_dict[i]} = {pyobj_repr(elem)}") + else: + print(f"{indent} [{i}] {pyobj_repr(elem)}") + except Exception as e: + print(f"{indent}Tuple: {pyobj_ptr} (cannot inspect: {e})") + # ============================================================================ # PRETTY-PRINTERS # ============================================================================ @@ -447,7 +649,7 @@ def invoke(self, arg, from_tty): val = gdb.parse_and_eval(arg) m_str = std_string_to_str(val["m_Str"]) 
print(f"LiteralPart @ {val.address}") - print(f" m_Str: \"{m_str}\"") + print(f" String: \"{m_str}\"") except Exception as e: print(f"Error: {e}") @@ -463,8 +665,8 @@ def invoke(self, arg, from_tty): from_val = int(val["_from"]) to_val = int(val["_to"]) print(f"RangePart @ {val.address}") - print(f" _from: {from_val}") - print(f" _to: {to_val}") + print(f" From: {from_val}") + print(f" To: {to_val}") print(f" readsLines: true") except Exception as e: print(f"Error: {e}") @@ -482,9 +684,9 @@ def invoke(self, arg, from_tty): to_val = int(val["_to"]) sep = std_string_to_str(val["m_WordSep"]) print(f"WordRangePart @ {val.address}") - print(f" _from: {from_val}") - print(f" _to: {to_val}") - print(f" m_WordSep: \"{sep}\"") + print(f" From: {from_val}") + print(f" To: {to_val}") + print(f" Word Separator: \"{sep}\"") except Exception as e: print(f"Error: {e}") @@ -501,9 +703,9 @@ def invoke(self, arg, from_tty): to_val = int(val["_to"]) sep = std_string_to_str(val["m_FieldSep"]) print(f"FieldRangePart @ {val.address}") - print(f" _from: {from_val}") - print(f" _to: {to_val}") - print(f" m_FieldSep: \"{sep}\"") + print(f" From: {from_val}") + print(f" To: {to_val}") + print(f" Field Separator: \"{sep}\"") except Exception as e: print(f"Error: {e}") @@ -520,8 +722,8 @@ def invoke(self, arg, from_tty): type_str = CLOCK_TYPE.get(type_val, f"Unknown({type_val})") clock = int(val["m_StaticClock"]) print(f"ClockPart @ {val.address}") - print(f" m_Type: {type_str}") - print(f" m_StaticClock: {clock}") + print(f" Type: {type_str}") + print(f" Static Clock: {clock}") except Exception as e: print(f"Error: {e}") @@ -536,7 +738,7 @@ def invoke(self, arg, from_tty): val = gdb.parse_and_eval(arg) fid = std_string_to_str(val["m_fieldIdentifier"]) print(f"IDPart @ {val.address}") - print(f" m_fieldIdentifier: \"{fid}\"") + print(f" Field Identifier: \"{fid}\"") except Exception as e: print(f"Error: {e}") @@ -552,8 +754,8 @@ def invoke(self, arg, from_tty): raw_expr = 
std_string_to_str(val["m_rawExpression"]) is_assn = bool(val["m_isAssignment"]) print(f"ExpressionPart @ {val.address}") - print(f" m_rawExpression: \"{raw_expr}\"") - print(f" m_isAssignment: {is_assn}") + print(f" Expression: \"{raw_expr}\"") + print(f" Is Assignment: {is_assn}") except Exception as e: print(f"Error: {e}") @@ -572,7 +774,7 @@ def invoke(self, arg, from_tty): val = gdb.parse_and_eval(arg) orig_idx = int(val["m_originalIndex"]) print(f"Item @ {val.address}") - print(f" m_originalIndex: {orig_idx}") + print(f" Original Index: {orig_idx}") # Try to call virtual methods try: @@ -636,12 +838,12 @@ def invoke(self, arg, from_tty): conv_str = STRING_CONVERSIONS.get(conv, f"Unknown({conv})") align_str = OUTPUT_ALIGNMENT.get(align, f"Unknown({align})") - print(f" m_label: {label}") - print(f" m_outStart: {out_start}") - print(f" m_maxLength: {max_len}") - print(f" m_strip: {strip}") - print(f" m_conversion: {conv_str}") - print(f" m_alignment: {align_str}") + print(f" Label: {label}") + print(f" Output Start: {out_start}") + print(f" Max Length: {max_len}") + print(f" Strip: {strip}") + print(f" Conversion: {conv_str}") + print(f" Alignment: {align_str}") except Exception as e: print(f"Error: {e}") @@ -689,8 +891,8 @@ def invoke(self, arg, from_tty): val = gdb.parse_and_eval(arg) raw_expr = std_string_to_str(val["m_rawExpression"]) key = int(val["m_key"]) - print(f" m_rawExpression: \"{raw_expr}\"") - print(f" m_key: {key}") + print(f" Expression: \"{raw_expr}\"") + print(f" Key: {key}") except Exception as e: print(f"Error: {e}") @@ -715,8 +917,8 @@ def invoke(self, arg, from_tty): satisfied = bool(val["m_bSatisfied"]) skip_type = "SKIPUNTIL" if is_until else "SKIPWHILE" print(f" Type: {skip_type}") - print(f" m_rawExpression: \"{raw_expr}\"") - print(f" m_bSatisfied: {satisfied}") + print(f" Expression: \"{raw_expr}\"") + print(f" Satisfied: {satisfied}") except Exception as e: print(f"Error: {e}") @@ -740,9 +942,9 @@ def invoke(self, arg, from_tty): 
pred_str = CONDITION_PREDICATE.get(pred, f"Unknown({pred})") raw_expr = std_string_to_str(val["m_rawExpression"]) is_assn = bool(val["m_isAssignment"]) - print(f" m_pred: {pred_str}") - print(f" m_rawExpression: \"{raw_expr}\"") - print(f" m_isAssignment: {is_assn}") + print(f" Predicate: {pred_str}") + print(f" Expression: \"{raw_expr}\"") + print(f" Is Assignment: {is_assn}") except Exception as e: print(f"Error: {e}") @@ -763,7 +965,7 @@ def invoke(self, arg, from_tty): # Then print derived class fields val = gdb.parse_and_eval(arg) ident = chr(int(val["m_identifier"])) - print(f" m_identifier: {ident}") + print(f" Identifier: {ident}") except Exception as e: print(f"Error: {e}") @@ -784,7 +986,7 @@ def invoke(self, arg, from_tty): # Then print derived class fields val = gdb.parse_and_eval(arg) offset = int(val["m_offset"]) - print(f" m_offset: {offset}") + print(f" Offset: {offset}") except Exception as e: print(f"Error: {e}") @@ -806,8 +1008,8 @@ def invoke(self, arg, from_tty): val = gdb.parse_and_eval(arg) stream = int(val["m_stream"]) b_output = bool(val["bOutput"]) - print(f" m_stream: {stream}") - print(f" bOutput: {b_output}") + print(f" Stream: {stream}") + print(f" Output: {b_output}") except Exception as e: print(f"Error: {e}") @@ -833,9 +1035,9 @@ def invoke(self, arg, from_tty): current_piece = int(val["m_currentPiece"]) split_type = "SPLITF" if is_field else "SPLITW" print(f" Type: {split_type}") - print(f" m_separator: \"{sep}\"") - print(f" m_splitting: {splitting}") - print(f" m_currentPiece: {current_piece}") + print(f" Separator: \"{sep}\"") + print(f" Splitting: {splitting}") + print(f" Current Piece: {current_piece}") except Exception as e: print(f"Error: {e}") @@ -860,8 +1062,8 @@ def invoke(self, arg, from_tty): item_count = std_vector_size(items_vec) print(f"itemGroup @ {val.address}") - print(f" bNeedRunoutCycle: {need_runout}") - print(f" bFoundSelectSecond: {found_second}") + print(f" Need Runout Cycle: {need_runout}") + print(f" Found 
Select Second: {found_second}") print(f" Item count: {item_count}") print(f" Items:") @@ -898,10 +1100,10 @@ def invoke(self, arg, from_tty): orig = std_string_to_str(val["m_orig"]) print(f"Token @ {val.address}") - print(f" m_type: {type_str}") - print(f" m_literal: \"{literal}\"") - print(f" m_argc: {argc}") - print(f" m_orig: \"{orig}\"") + print(f" Type: {type_str}") + print(f" Literal: \"{literal}\"") + print(f" Arg Count: {argc}") + print(f" Original: \"{orig}\"") except Exception as e: print(f"Error: {e}") @@ -921,12 +1123,12 @@ def invoke(self, arg, from_tty): first = int(val["m_first"]) last = int(val["m_last"]) print(f"TokenFieldRangeSimple @ {val.address}") - print(f" m_first: {first}") - print(f" m_last: {last}") - print(f" bDone: {b_done}") + print(f" First: {first}") + print(f" Last: {last}") + print(f" Done: {b_done}") except: print(f"TokenFieldRange @ {val.address}") - print(f" bDone: {b_done}") + print(f" Done: {b_done}") except Exception as e: print(f"Error: {e}") @@ -940,53 +1142,279 @@ class DumpProcessingState(gdb.Command): def __init__(self): super(DumpProcessingState, self).__init__("dump-processing-state", gdb.COMMAND_DATA) + @staticmethod + def _fmt_record(shared_str): + """Format a PSpecString (shared_ptr) for display.""" + obj = deref_shared_ptr(shared_str) + if obj is None: + return "" + s = std_string_to_str(obj) + if len(s) > 60: + return f"\"{s[:60]}...\" (len={len(s)})" + return f"\"{s}\"" + def invoke(self, arg, from_tty): try: val = gdb.parse_and_eval(arg) + print(f"ProcessingState @ {val.address}") - # Current record - ps = deref_shared_ptr(val["m_ps"]) - if ps: - record_str = std_string_to_str(ps) - else: - record_str = "" + # --- Records --- + print(f" Current Record: {self._fmt_record(val['m_ps'])}") + print(f" Previous Record: {self._fmt_record(val['m_prevPs'])}") + print(f" Input Record: {self._fmt_record(val['m_inputRecord'])}") - # Previous record - prev_ps = deref_shared_ptr(val["m_prevPs"]) - if prev_ps: - prev_record_str 
= std_string_to_str(prev_ps) - else: - prev_record_str = "" + # --- Separators & Padding --- + try: + pad = chr(int(val["m_pad"])) + print(f" Pad Char: '{pad}' (0x{ord(pad):02x})") + except Exception as e: + print(f" Pad Char: (error: {e})") - pad = chr(int(val["m_pad"])) - word_sep = std_string_to_str(val["m_wordSeparator"]) - field_sep = std_string_to_str(val["m_fieldSeparator"]) - cycle = int(val["m_CycleCounter"]) - extra_reads = int(val["m_ExtraReads"]) - word_count = int(val["m_wordCount"]) - field_count = int(val["m_fieldCount"]) - input_station = int(val["m_inputStation"]) - input_stream = int(val["m_inputStream"]) - output_idx = int(val["m_outputIndex"]) - no_write = bool(val["m_bNoWrite"]) - eof = bool(val["m_bEOF"]) + try: + ws_local = bool(val["m_wordSeparatorLocal"]) + word_sep = std_string_to_str(val["m_wordSeparator"]) + local_tag = " (local)" if ws_local else "" + print(f" Word Separator: \"{word_sep}\"{local_tag}") + except Exception as e: + print(f" Word Separator: (error: {e})") - print(f"ProcessingState @ {val.address}") - print(f" Current Record: \"{record_str[:50]}{'...' if len(record_str) > 50 else ''}\"") - print(f" Previous Record: \"{prev_record_str[:50]}{'...' 
if len(prev_record_str) > 50 else ''}\"") - print(f" Pad Char: '{pad}' (0x{ord(pad):02x})") - print(f" Word Separator: \"{word_sep}\"") - print(f" Field Separator: \"{field_sep}\"") - print(f" Cycle Counter: {cycle}") - print(f" Extra Reads: {extra_reads}") - print(f" Record Count: {cycle + extra_reads}") - print(f" Word Count: {word_count}") - print(f" Field Count: {field_count}") - print(f" Input Station: {input_station}") - print(f" Input Stream: {input_stream}") - print(f" Output Index: {output_idx}") - print(f" No Write: {no_write}") - print(f" EOF: {eof}") + try: + field_sep = std_string_to_str(val["m_fieldSeparator"]) + print(f" Field Separator: \"{field_sep}\"") + except Exception as e: + print(f" Field Separator: (error: {e})") + + # --- Counters --- + try: + cycle = int(val["m_CycleCounter"]) + extra_reads = int(val["m_ExtraReads"]) + context_offset = int(val["m_contextOffset"]) + print(f" Cycle Counter: {cycle}") + print(f" Extra Reads: {extra_reads}") + print(f" Record Count: {cycle + extra_reads}") + print(f" Context Offset: {context_offset}") + except Exception as e: + print(f" Counters: (error: {e})") + + # --- Word / Field Caches --- + try: + word_count = int(val["m_wordCount"]) + field_count = int(val["m_fieldCount"]) + print(f" Word Count: {word_count}") + print(f" Field Count: {field_count}") + except Exception as e: + print(f" Word/Field Count: (error: {e})") + + try: + ws = std_vector_int_items(val["m_wordStart"]) + we = std_vector_int_items(val["m_wordEnd"]) + n = len(ws) + print(f" Word Positions ({n} cached):") + if n == 0: + print(f" (none)") + else: + limit = min(n, 20) + for i in range(limit): + end_val = we[i] if i < len(we) else "?" + print(f" [{i}] {ws[i]}-{end_val}") + if n > 20: + print(f" ... 
and {n - 20} more") + except Exception as e: + print(f" Word Positions: (error: {e})") + + try: + fs = std_vector_int_items(val["m_fieldStart"]) + fe = std_vector_int_items(val["m_fieldEnd"]) + n = len(fs) + print(f" Field Positions ({n} cached):") + if n == 0: + print(f" (none)") + else: + limit = min(n, 20) + for i in range(limit): + end_val = fe[i] if i < len(fe) else "?" + print(f" [{i}] {fs[i]}-{end_val}") + if n > 20: + print(f" ... and {n - 20} more") + except Exception as e: + print(f" Field Positions: (error: {e})") + + # --- Field Identifiers --- + try: + fi = val["m_fieldIdentifiers"] + fi_size = std_map_size(fi) + print(f" Field Identifiers ({fi_size} entries):") + if fi_size == 0: + print(f" (none)") + else: + items = std_map_items(fi) + for key, value in items: + try: + k = chr(int(key)) + s = deref_shared_ptr(value) + v = std_string_to_str(s) if s else "" + if len(v) > 40: + v = v[:40] + "..." + print(f" '{k}' = \"{v}\"") + except: + pass + except Exception as e: + print(f" Field Identifiers: (error: {e})") + + # --- FI Statistics --- + try: + fis = val["m_fiStatistics"] + fis_size = std_map_size(fis) + print(f" FI Statistics ({fis_size} entries):") + if fis_size == 0: + print(f" (none)") + else: + items = std_map_items(fis) + for key, value in items: + try: + k = chr(int(key)) + stats = deref_shared_ptr(value) + if stats: + total = int(stats["m_totalCount"]) + int_count = int(stats["m_intCount"]) + float_count = int(stats["m_floatCount"]) + print(f" '{k}': {total} values ({int_count} int, {float_count} float)") + else: + print(f" '{k}': ") + except: + pass + except Exception as e: + print(f" FI Statistics: (error: {e})") + + # --- Break Values --- + try: + bv = val["m_breakValues"] + bv_size = std_map_size(bv) + print(f" Break Values ({bv_size} entries):") + if bv_size == 0: + print(f" (none)") + else: + items = std_map_items(bv) + for key, value in items: + try: + k = chr(int(key)) + s = deref_shared_ptr(value) + v = std_string_to_str(s) if s else 
"" + if len(v) > 40: + v = v[:40] + "..." + print(f" '{k}' = \"{v}\"") + except: + pass + except Exception as e: + print(f" Break Values: (error: {e})") + + try: + bl = int(val["m_breakLevel"]) + if bl == 0: + print(f" Break Level: (none)") + else: + print(f" Break Level: '{chr(bl)}' (0x{bl:02x})") + except Exception as e: + print(f" Break Level: (error: {e})") + + # --- Frequency Maps --- + try: + fm = val["m_freqMaps"] + fm_size = std_map_size(fm) + print(f" Frequency Maps ({fm_size} entries):") + if fm_size == 0: + print(f" (none)") + else: + items = std_map_items(fm) + for key, value in items: + try: + k = chr(int(key)) + fmap = deref_shared_ptr(value) + if fmap: + nelem = int(fmap["map"]["_M_element_count"]) + counter = int(fmap["counter"]) + print(f" '{k}': {nelem} unique elements, {counter} total") + else: + print(f" '{k}': ") + except: + pass + except Exception as e: + print(f" Frequency Maps: (error: {e})") + + # --- Conditions Stack --- + try: + cond_items = std_stack_items(val["m_Conditions"]) + depth = len(cond_items) + print(f" Conditions ({depth} deep):") + if depth == 0: + print(f" (empty)") + else: + for i in range(depth - 1, -1, -1): + label = "top -> " if i == depth - 1 else " " + v = int(cond_items[i]) + name = EXTREME_BOOL.get(v, f"Unknown({v})") + print(f" {label}{name}") + except Exception as e: + print(f" Conditions: (error: {e})") + + # --- Loops Stack --- + try: + loop_items = std_stack_items(val["m_Loops"]) + depth = len(loop_items) + print(f" Loops ({depth} deep):") + if depth == 0: + print(f" (empty)") + else: + for i in range(depth - 1, -1, -1): + label = "top -> " if i == depth - 1 else " " + v = int(loop_items[i]) + print(f" {label}token #{v}") + except Exception as e: + print(f" Loops: (error: {e})") + + # --- I/O State --- + try: + input_station = int(val["m_inputStation"]) + station_name = INPUT_STATION.get(input_station, f"Stream({input_station})") + print(f" Input Station: {station_name}") + except Exception as e: + print(f" 
Input Station: (error: {e})") + + try: + input_stream = int(val["m_inputStream"]) + print(f" Input Stream: {input_stream}") + except Exception as e: + print(f" Input Stream: (error: {e})") + + try: + stream_changed = bool(val["m_inputStreamChanged"]) + print(f" Stream Changed: {stream_changed}") + except Exception as e: + print(f" Stream Changed: (error: {e})") + + try: + writers = val["m_Writers"] + print(f" Writers: {writers}") + except Exception as e: + print(f" Writers: (error: {e})") + + try: + output_idx = int(val["m_outputIndex"]) + print(f" Output Index: {output_idx}") + except Exception as e: + print(f" Output Index: (error: {e})") + + try: + no_write = bool(val["m_bNoWrite"]) + print(f" No Write: {no_write}") + except Exception as e: + print(f" No Write: (error: {e})") + + try: + eof = bool(val["m_bEOF"]) + print(f" EOF: {eof}") + except Exception as e: + print(f" EOF: (error: {e})") except Exception as e: print(f"Error: {e}") @@ -1032,10 +1460,10 @@ def invoke(self, arg, from_tty): b_ran_dry = bool(val["m_bRanDry"]) print(f"Reader @ {val.address}") - print(f" m_countRead: {count_read}") - print(f" m_countUsed: {count_used}") - print(f" m_bAbort: {b_abort}") - print(f" m_bRanDry: {b_ran_dry}") + print(f" Count Read: {count_read}") + print(f" Count Used: {count_used}") + print(f" Abort: {b_abort}") + print(f" Ran Dry: {b_ran_dry}") except Exception as e: print(f"Error: {e}") @@ -1053,9 +1481,9 @@ def invoke(self, arg, from_tty): ended = bool(val["m_ended"]) print(f"Writer @ {val.address}") - print(f" m_countGenerated: {count_gen}") - print(f" m_countWritten: {count_written}") - print(f" m_ended: {ended}") + print(f" Count Generated: {count_gen}") + print(f" Count Written: {count_written}") + print(f" Ended: {ended}") except Exception as e: print(f"Error: {e}") @@ -1078,9 +1506,9 @@ def invoke(self, arg, from_tty): exact = bool(val["m_exact"]) print(f"ALUValue @ {val.address}") - print(f" m_type: {type_str}") - print(f" m_value: \"{value_str}\"") - print(f" 
m_exact: {exact}") + print(f" Type: {type_str}") + print(f" Value: \"{value_str}\"") + print(f" Exact: {exact}") except Exception as e: print(f"Error: {e}") @@ -1097,7 +1525,7 @@ def invoke(self, arg, from_tty): print(f"ALUCounters @ {val.address}") print(f" Counters (map):") # Simplified: just show the address - print(f" m_map @ {val['m_map'].address}") + print(f" Map @ {val['m_map'].address}") except Exception as e: print(f"Error: {e}") @@ -1163,9 +1591,9 @@ def invoke(self, arg, from_tty): total_count = int(val["m_totalCount"]) print(f"AluValueStats @ {val.address}") - print(f" m_intCount: {int_count}") - print(f" m_floatCount: {float_count}") - print(f" m_totalCount: {total_count}") + print(f" Int Count: {int_count}") + print(f" Float Count: {float_count}") + print(f" Total Count: {total_count}") except Exception as e: print(f"Error: {e}") @@ -1230,9 +1658,9 @@ def invoke(self, arg, from_tty): relies_on_input = bool(val["m_reliesOnInput"]) print(f"AluFunction @ {val.address}") - print(f" m_FuncName: {func_name}") - print(f" m_ArgCount: {arg_count}") - print(f" m_reliesOnInput: {relies_on_input}") + print(f" Function Name: {func_name}") + print(f" Arg Count: {arg_count}") + print(f" Relies On Input: {relies_on_input}") except Exception as e: print(f"Error: {e}") @@ -1306,33 +1734,100 @@ def invoke(self, arg, from_tty): try: val = gdb.parse_and_eval(arg) + print(f"PythonFunctionCollection @ {val.address}") + # Access m_Initialized try: m_initialized = bool(val["m_Initialized"]) - print(f"PythonFunctionCollection @ {val.address}") - print(f" m_Initialized: {m_initialized}") + print(f" Initialized: {m_initialized}") + except Exception as e: + print(f" Initialized: (error: {e})") + + # Try to access m_Functions map and iterate + try: + m_functions = val["m_Functions"] + map_size = std_map_size(m_functions) + print(f" Functions ({map_size} entries):") - # Try to access m_Functions map (simplified) - try: - m_functions = val["m_Functions"] - print(f" m_Functions @ 
{m_functions.address}") - except: - pass - except: - print(f"PythonFunctionCollection @ {val.address}") + if map_size == 0: + print(f" (empty)") + else: + # Iterate the map using std_map_items + items = std_map_items(m_functions) + if items: + for key, value in items: + try: + # value is shared_ptr + func_rec = deref_shared_ptr(value) + if func_rec: + func_name = std_string_to_str(func_rec["m_name"]).ljust(20) + func_ptr = func_rec["m_pFuncPtr"] + arg_type_exact = bool(func_rec["m_argTypeExact"]) + arg_count = std_vector_size(func_rec["m_args"]) + + exact_str = ", exact arguments" if arg_type_exact else "" + print(f" {func_name} @ {func_ptr} ({arg_count} args{exact_str})") + except: + pass + else: + print(f" (iteration failed)") + except Exception as e: + print(f" Functions: (error: {e})") except Exception as e: print(f"Error: {e}") -class DumpPythonFuncRec(gdb.Command): - """Dump a PythonFuncRec (internal class from PythonIntf.cc).""" +class DumpPythonFuncByName(gdb.Command): + """Dump a PythonFuncRec by looking it up in a PythonFunctionCollection by name.""" def __init__(self): - super(DumpPythonFuncRec, self).__init__("dump-python-func-rec", gdb.COMMAND_DATA) + super(DumpPythonFuncByName, self).__init__("dump-python-func-by-name", gdb.COMMAND_DATA) def invoke(self, arg, from_tty): try: - val = gdb.parse_and_eval(arg) + # Parse arguments: collection_expr function_name + args = arg.split(None, 1) + if len(args) < 2: + print("Usage: dump-python-func-by-name ") + return + + collection_expr = args[0] + func_name_arg = args[1] + + # Remove quotes if present + if func_name_arg.startswith('"') and func_name_arg.endswith('"'): + func_name_arg = func_name_arg[1:-1] + elif func_name_arg.startswith("'") and func_name_arg.endswith("'"): + func_name_arg = func_name_arg[1:-1] + + # Evaluate the collection + collection = gdb.parse_and_eval(collection_expr) + + # Access the m_Functions map and find the function by name + m_functions = collection["m_Functions"] + items = 
std_map_items(m_functions) + + found = False + if items: + for key, value in items: + try: + key_str = std_string_to_str(key) + if key_str == func_name_arg: + func_rec = deref_shared_ptr(value) + if func_rec: + self._dump_python_func_rec(func_rec) + found = True + break + except: + pass + if not found: + print(f"Function '{func_name_arg}' not found in collection") + except Exception as e: + print(f"Error: {e}") + + def _dump_python_func_rec(self, val): + """Dump a PythonFuncRec (reuses logic from DumpPythonFuncRec).""" + try: print(f"PythonFuncRec @ {val.address}") # Access members @@ -1371,31 +1866,138 @@ def invoke(self, arg, from_tty): except Exception as e: print(f" Arg Type: (error: {e})") - # Show m_pTuple + # Expand m_args vector + argDict = dict() + try: + m_args = val["m_args"] + arg_size = std_vector_size(m_args) + print(f" Args ({arg_size} items):") + + # Try to iterate and dump each argument + args_start = m_args["_M_impl"]["_M_start"] + for i in range(arg_size): + try: + arg_elem = args_start[i] + arg_name = std_string_to_str(arg_elem["m_name"]) + arg_default = int(arg_elem["m_default"]) + arg_default_str = ALU_COUNTER_TYPE.get(arg_default, f"Unknown({arg_default})") + + print(f" [{i}] {arg_name} (default: {arg_default_str})") + argDict[i] = arg_name + + # Show default value if present + if arg_default == 1: # counterType__Str + try: + defStr = std_string_to_str(arg_elem["m_defStr"]) + print(f" = \"{defStr}\"") + except: + pass + elif arg_default == 2: # counterType__Int + try: + defInt = int(arg_elem["m_defInt"]) + print(f" = {defInt}") + except: + pass + elif arg_default == 3: # counterType__Float + try: + defFloat = float(arg_elem["m_defFloat"]) + print(f" = {defFloat}") + except: + pass + except Exception as arg_e: + print(f" [{i}] (error: {arg_e})") + except Exception as e: + print(f" m_args: (error: {e})") + + # Show m_pTuple (argument values for the current Python function call) try: m_pTuple = val["m_pTuple"] - if m_pTuple == 0: - print(f" 
m_pTuple: nullptr") + dump_pytuple(m_pTuple, argDict) + except Exception as e: + print(f" Tuple: (error: {e})") + + + except Exception as e: + print(f"Error: {e}") + +class DumpPythonFuncRec(gdb.Command): + """Dump a PythonFuncRec (internal class from PythonIntf.cc).""" + + def __init__(self): + super(DumpPythonFuncRec, self).__init__("dump-python-func-rec", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + + # Auto-cast: if val is a shared_ptr, extract + # _M_ptr and cast to PythonFuncRec* (the derived type). + val_type = val.type.strip_typedefs() + type_name = str(val_type) + if "shared_ptr" in type_name and "ExternalFunctionRec" in type_name: + ptr = val["_M_ptr"] + if ptr == 0: + print("PythonFuncRec: nullptr") + return + python_func_type = gdb.lookup_type("PythonFuncRec").pointer() + val = ptr.cast(python_func_type).dereference() + + print(f"PythonFuncRec @ {val.address}") + + # Access members + try: + m_name = std_string_to_str(val["m_name"]) + print(f" Name: {m_name}") + except Exception as e: + print(f" Name: (error: {e})") + + try: + m_pFuncPtr = val["m_pFuncPtr"] + print(f" Func Ptr: {m_pFuncPtr}") + except Exception as e: + print(f" Func Ptr: (error: {e})") + + # Show m_doc + try: + m_doc = std_string_to_str(val["m_doc"]) + if m_doc: + # Format multi-line docs nicely + if "\n" in m_doc: + print(f" doc:") + for line in m_doc.split("\n"): + print(f" {line}") + else: + print(f" doc: {m_doc}") else: - print(f" m_pTuple: {m_pTuple}") + print(f" doc: (empty)") + except Exception as e: + print(f" doc: (error: {e})") + + # Show m_argTypeExact + try: + m_argTypeExact = bool(val["m_argTypeExact"]) + print(" Arg Type: {}".format("exact" if m_argTypeExact else "no exactness information")) except Exception as e: - print(f" m_pTuple: (error: {e})") + print(f" Arg Type: (error: {e})") # Expand m_args vector + argDict = dict() try: m_args = val["m_args"] arg_size = std_vector_size(m_args) print(f" Args ({arg_size} items):") 
# Try to iterate and dump each argument + args_start = m_args["_M_impl"]["_M_start"] for i in range(arg_size): try: - arg_elem = m_args[i] + arg_elem = args_start[i] arg_name = std_string_to_str(arg_elem["m_name"]) arg_default = int(arg_elem["m_default"]) arg_default_str = ALU_COUNTER_TYPE.get(arg_default, f"Unknown({arg_default})") print(f" [{i}] {arg_name} (default: {arg_default_str})") + argDict[i] = arg_name # Show default value if present if arg_default == 1: # counterType__Str @@ -1419,7 +2021,15 @@ def invoke(self, arg, from_tty): except Exception as arg_e: print(f" [{i}] (error: {arg_e})") except Exception as e: - print(f" m_args: (error: {e})") + print(f" Args: (error: {e})") + + # Show m_pTuple (argument values for the current Python function call) + try: + m_pTuple = val["m_pTuple"] + dump_pytuple(m_pTuple, argDict) + except Exception as e: + print(f" Tuple: (error: {e})") + except Exception as e: print(f"Error: {e}") @@ -1440,26 +2050,26 @@ def invoke(self, arg, from_tty): m_default_str = ALU_COUNTER_TYPE.get(m_default, f"Unknown({m_default})") print(f"PythonFuncArg @ {val.address}") - print(f" m_name: {m_name}") - print(f" m_default: {m_default_str}") + print(f" Name: {m_name}") + print(f" Default Type: {m_default_str}") # Try to get default value if m_default == 1: # counterType__Str try: m_defStr = std_string_to_str(val["m_defStr"]) - print(f" m_defStr: \"{m_defStr}\"") + print(f" Default String: \"{m_defStr}\"") except: pass elif m_default == 2: # counterType__Int try: m_defInt = int(val["m_defInt"]) - print(f" m_defInt: {m_defInt}") + print(f" Default Int: {m_defInt}") except: pass elif m_default == 3: # counterType__Float try: m_defFloat = float(val["m_defFloat"]) - print(f" m_defFloat: {m_defFloat}") + print(f" Default Float: {m_defFloat}") except: pass except Exception as inner_e: @@ -1537,6 +2147,7 @@ def register_commands(): DumpExternalFunctionRec() DumpExternalFunctionCollection() DumpPythonFunctionCollection() + DumpPythonFuncByName() 
DumpPythonFuncRec() DumpPythonFuncArg() From 200b1be35a29d3666e59df03a3e707bcc98205db Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Mon, 25 May 2026 20:41:55 +0300 Subject: [PATCH 22/50] Documentation fixes (#391) --- specs/docs/debugging.md | 54 +++++++++++++++++++++++++-------------- specs/src/gdb/COMMANDS.md | 44 +++++++++++++++---------------- 2 files changed, 57 insertions(+), 41 deletions(-) diff --git a/specs/docs/debugging.md b/specs/docs/debugging.md index 9f923a6..047b579 100644 --- a/specs/docs/debugging.md +++ b/specs/docs/debugging.md @@ -21,7 +21,7 @@ To debug specs effectively, you must build with debug symbols enabled. ```bash cd specs/src -python setup.py -v DEBUG +python3 setup.py -v DEBUG make clean all ``` @@ -30,38 +30,30 @@ The `-v DEBUG` flag tells the setup script to enable debug symbols and disable o ### On Windows ```cmd -cd specs\src -python setup.py -v DEBUG -c VS msbuild specs\specs.sln /p:Configuration=Debug /p:Platform=x64 ``` +However, `gdb` is not normally the debugger that you use on Windows. + --- ## Loading the GDB Macros ### Automatic Loading (Recommended) -When you run GDB from the `specs/src/` directory, the `.gdbinit` file is automatically loaded: +When you run GDB from the `specs/src/` directory, the `.gdbinit` file is automatically loaded. So you can add the `specs.gdb` file to `.gdbinit`. Or you can specify it on the command line: ```bash cd specs/src -gdb ./specs +gdb ../exe/specs -x gdb/specs.gdb ``` -GDB will automatically source `gdb/specs.gdb`, which loads the Python extension and registers all dump commands. 
- ### Manual Loading -If you're running GDB from a different directory, you can manually load the macros: +If you're running GDB from a different directory, you can manually load the macros from within GDB: -```bash -gdb ./specs -x specs/src/gdb/specs.gdb ``` - -Or from within GDB: - -``` -(gdb) source specs/src/gdb/specs.gdb +(gdb) source gdb/specs.gdb ``` ### Verify Loading @@ -69,15 +61,39 @@ Or from within GDB: After loading, you should see a welcome message: ``` +specs GDB extension loaded successfully + ======================================== specs GDB debugging macros loaded ======================================== Available commands: - dump_pstate - Dump ProcessingState - dump_sb - Dump StringBuilder - dump_item - Dump Item (polymorphic) - ... +dump_pstate - Dump ProcessingState +dump_sb - Dump StringBuilder +dump_item - Dump Item (polymorphic) +dump_items - Dump itemGroup +dump_token - Dump Token +dump_alu_value - Dump ALUValue +dump_alu_counters - Dump ALUCounters +dump_alu_vec - Dump AluVec +dump_alu_function - Dump AluFunction +dump_external_func_rec - Dump ExternalFunctionRec +dump_python_func_collection - Dump PythonFunctionCollection +dump_python_func_by_name - Dump PythonFuncRec by name +dump_python_func_rec - Dump PythonFuncRec +dump_python_func_arg - Dump PythonFuncArg +dump_exception - Dump SpecsException + +Breakpoint helpers: +bp_apply - Break on all 9 Item subclass apply methods +bp_getstr - Break on InputPart::getStr +bp_compile - Break on itemGroup::Compile +bp_parseAluExpression - Break on parseAluExpression, where expressions are parsed +bp_pfc_initialize - Break on PythonFunctionCollection::Initialize, where the Python Function Collection is initialized +bp_func_setargvalue - Break on PythonFuncRec::setArgValue, where an argument for an external function is set +bp_func_call - Break on PythonFuncRec::Call, where an external function is invoked + +For more help, type: help dump-processing-state ``` --- diff --git 
a/specs/src/gdb/COMMANDS.md b/specs/src/gdb/COMMANDS.md index 00bb809..bb183fa 100644 --- a/specs/src/gdb/COMMANDS.md +++ b/specs/src/gdb/COMMANDS.md @@ -63,34 +63,34 @@ ## Python Interface Commands -|| Command | Alias | Description | -||---------|-------|-------------| -|| `dump-alu-function` | `dump_alu_function` | Dump an AluFunction (name, arg count, input dependency) | -|| `dump-external-function-rec` | `dump_external_func_rec` | Dump an ExternalFunctionRec (calls virtual methods GetArgCount/GetFuncPtr) | -|| `dump-external-function-collection` | `dump_external_func_collection` | Dump an ExternalFunctionCollection (initialization state) | -|| `dump-python-function-collection` | `dump_python_func_collection` | Dump a PythonFunctionCollection (registry state, function count, and function list) | -|| `dump-python-func-by-name` | `dump_python_func_by_name` | Dump a PythonFuncRec by looking it up in a collection by name | -|| `dump-python-func-rec` | `dump_python_func_rec` | Dump a PythonFuncRec (name, pointer, doc, and expanded argument list) | -|| `dump-python-func-arg` | `dump_python_func_arg` | Dump a PythonFuncArg (name, default type, and default value) | +| Command | Alias | Description | +|---------|-------|-------------| +| `dump-alu-function` | `dump_alu_function` | Dump an AluFunction (name, arg count, input dependency) | +| `dump-external-function-rec` | `dump_external_func_rec` | Dump an ExternalFunctionRec (calls virtual methods GetArgCount/GetFuncPtr) | +| `dump-external-function-collection` | `dump_external_func_collection` | Dump an ExternalFunctionCollection (initialization state) | +| `dump-python-function-collection` | `dump_python_func_collection` | Dump a PythonFunctionCollection (registry state, function count, and function list) | +| `dump-python-func-by-name` | `dump_python_func_by_name` | Dump a PythonFuncRec by looking it up in a collection by name | +| `dump-python-func-rec` | `dump_python_func_rec` | Dump a PythonFuncRec (name, pointer, 
doc, and expanded argument list) | +| `dump-python-func-arg` | `dump_python_func_arg` | Dump a PythonFuncArg (name, default type, and default value) | ## Utility Commands -|| Command | Alias | Description | -||---------|-------|-------------| -|| `dump-exception` | `dump_exception` | Dump a SpecsException | -|| `dump-all` | — | Dump all relevant debugging info | +| Command | Alias | Description | +|---------|-------|-------------| +| `dump-exception` | `dump_exception` | Dump a SpecsException | +| `dump-all` | — | Dump all relevant debugging info | ## Breakpoint Helpers -|| Command | Description | -||---------|-------------| -|| `bp_apply` | Set breakpoints on all 9 Item subclass apply methods | -|| `bp_getstr` | Set breakpoint on InputPart::getStr | -|| `bp_compile` | Set breakpoint on itemGroup::Compile | -|| `bp_parseAluExpression` | Set breakpoint on parseAluExpression, where expressions are parsed | -|| `bp_pfc_initialize` | Set breakpoint on PythonFunctionCollection::Initialize, where the Python Function Collection is initialized | -|| `bp_func_setargvalue` | Set breakpoint on PythonFuncRec::setArgValue, where an argument for an external function is set | -|| `bp_func_call` | Set breakpoint on PythonFuncRec::Call, where an external function is invoked | +| Command | Description | +|---------|-------------| +| `bp_apply` | Set breakpoints on all 9 Item subclass apply methods | +| `bp_getstr` | Set breakpoint on InputPart::getStr | +| `bp_compile` | Set breakpoint on itemGroup::Compile | +| `bp_parseAluExpression` | Set breakpoint on parseAluExpression, where expressions are parsed | +| `bp_pfc_initialize` | Set breakpoint on PythonFunctionCollection::Initialize, where the Python Function Collection is initialized | +| `bp_func_setargvalue` | Set breakpoint on PythonFuncRec::setArgValue, where an argument for an external function is set | +| `bp_func_call` | Set breakpoint on PythonFuncRec::Call, where an external function is invoked | ## Usage Examples From 
5010976a306d4df27a18e43922e9b7a683a09c71 Mon Sep 17 00:00:00 2001 From: niry1 Date: Tue, 26 May 2026 11:08:15 +0300 Subject: [PATCH 23/50] Improve GDB macros and docs for Python functions --- specs/docs/debugging.md | 164 +++++++++++++++++++++++-------------- specs/src/gdb/COMMANDS.md | 20 ++--- specs/src/gdb/specs.gdb | 39 ++++----- specs/src/gdb/specs_gdb.py | 34 +++++++- 4 files changed, 157 insertions(+), 100 deletions(-) diff --git a/specs/docs/debugging.md b/specs/docs/debugging.md index 047b579..08b042c 100644 --- a/specs/docs/debugging.md +++ b/specs/docs/debugging.md @@ -85,13 +85,15 @@ dump_python_func_arg - Dump PythonFuncArg dump_exception - Dump SpecsException Breakpoint helpers: -bp_apply - Break on all 9 Item subclass apply methods -bp_getstr - Break on InputPart::getStr -bp_compile - Break on itemGroup::Compile -bp_parseAluExpression - Break on parseAluExpression, where expressions are parsed -bp_pfc_initialize - Break on PythonFunctionCollection::Initialize, where the Python Function Collection is initialized -bp_func_setargvalue - Break on PythonFuncRec::setArgValue, where an argument for an external function is set -bp_func_call - Break on PythonFuncRec::Call, where an external function is invoked +bp_apply - Break on all 9 Item subclass apply methods +bp_getstr - Break on InputPart::getStr +bp_compile - Break on itemGroup::Compile +bp_parseAluExpression - Break on parseAluExpression, where expressions are parsed +bp_pyfuncs - Break on the important functions related to Python functions: +. - PythonFunctionCollection::Initialize, where the Python Function Collection is initialized +. - PythonFunctionCollection::GetFunctionByName where the function record is retrieved based on name +. - PythonFuncRec::setArgValue, where an argument for an external function is set +.
- PythonFuncRec::Call, where an external function is invoked For more help, type: help dump-processing-state ``` @@ -159,22 +161,25 @@ For more help, type: help dump-processing-state ### Python Interface Commands -|| Command | Purpose | -||---------|----------| -|| `dump_alu_function ` | Dump an AluFunction (name, arg count, input dependency) | -|| `dump_external_func_rec ` | Dump an ExternalFunctionRec (polymorphic base class) | -|| `dump_external_func_collection ` | Dump an ExternalFunctionCollection (initialization state) | -|| `dump_python_func_collection ` | Dump a PythonFunctionCollection (internal Python function registry) | -|| `dump_python_func_rec ` | Dump a PythonFuncRec (Python function record with name and args) | -|| `dump_python_func_arg ` | Dump a PythonFuncArg (function argument with default value) | +| Command | Purpose | +|---------|----------| +| `dump_alu_function ` | Dump an AluFunction (name, arg count, input dependency) | +| `dump_external_func_rec ` | Dump an ExternalFunctionRec (polymorphic base class) | +| `dump_external_func_collection ` | Dump an ExternalFunctionCollection (initialization state) | +| `dump_python_func_collection ` | Dump a PythonFunctionCollection (internal Python function registry) | +| `dump_python_func_rec ` | Dump a PythonFuncRec (Python function record with name and args) | +| `dump_python_func_by_name ` | Dump a PythonFuncRec (Python function record with name and args) by collection and function name | +| `dump_python_func_arg ` | Dump a PythonFuncArg (function argument with default value) | ### Breakpoint Helpers -| Command | Purpose | -|---------|---------| -| `bp_apply` | Set breakpoint on Item::apply | -| `bp_getstr` | Set breakpoint on InputPart::getStr | -| `bp_compile` | Set breakpoint on itemGroup::Compile | +| Command | Description | +|---------|-------------| +| `bp_apply` | Set breakpoints on all 9 `::apply` methods of `Item` subclasses | +| `bp_getstr` | Set breakpoint on `InputPart::getStr` | +| 
`bp_compile` | Set breakpoint on `itemGroup::Compile` | +| `bp_parseAluExpression` | Set breakpoint on `parseAluExpression`, where expressions are parsed | +| `bp_pyfuncs` | Set breakpoints on all Python function-related methods. This includes `PythonFunctionCollection::Initialize`, where the Python Function Collection is initialized, `PythonFunctionCollection::GetFunctionByName` where the function record is retrieved based on name, `PythonFuncRec::setArgValue`, where an argument for an external function is set, and `PythonFuncRec::Call`, where an external function is invoked | --- @@ -185,30 +190,53 @@ For more help, type: help dump-processing-state Suppose you're debugging a spec that processes records and you want to see the current state: ``` -(gdb) break Item::apply +(gdb) bp_apply Breakpoint 1 at 0x... (gdb) run < input.txt Starting program: ./specs ... -Breakpoint 1, Item::apply (this=0x..., pState=0x..., pSB=0x...) at specitems/specItems.cc:... +Breakpoint 2, DataField::apply (this=0x5fb8b0, pState=..., pSB=0x7fffffffcf50) at specitems/dataField.cc:444 (gdb) dump_pstate pState -ProcessingState @ 0x7fffffffde00 - Current Record: "hello world" - Previous Record: "goodbye world" +ProcessingState @ 0x7fffffffd070 + Current Record: "How am I doing?" + Previous Record: "well, hello there" + Input Record: "How am I doing?" 
Pad Char: ' ' (0x20) - Word Separator: " " - Field Separator: "\t" - Cycle Counter: 42 + Word Separator: "" (local) + Field Separator: " " + Cycle Counter: 2 Extra Reads: 0 - Record Count: 42 - Word Count: 2 - Field Count: 1 - Input Station: -1 + Record Count: 2 + Context Offset: 0 + Word Count: -1 + Field Count: -1 + Word Positions (3 cached): + [0] 1-5 + [1] 7-11 + [2] 13-17 + Field Positions (0 cached): + (none) + Field Identifiers (0 entries): + (none) + FI Statistics (0 entries): + (none) + Break Values (1 entries): + 'a' = "well," + Break Level: (none) + Frequency Maps (0 entries): + (none) + Conditions (0 deep): + (empty) + Loops (0 deep): + (empty) + Input Station: FIRST Input Stream: 1 + Stream Changed: False + Writers: 0x7fffffffcfe0 Output Index: 1 - No Write: false - EOF: false + No Write: False + EOF: False ``` This shows you exactly what the current record is, how many times we've processed records, and the current separators. @@ -219,13 +247,19 @@ When debugging a data field specification: ``` (gdb) dump_data_field pDataField -DataField @ 0x... - m_label: A - m_outStart: 10 - m_maxLength: 20 - m_strip: true - m_conversion: UCASE - m_alignment: Left +Item @ 0x5fb8b0 + Original Index: 1 + readsLines: False + producesOutput: False + forcesRunoutCycle: False + isBreak: False +----- end of 'Item' dump + Label: a + Output Start: 10 + Max Length: 20 + Strip: True + Conversion: UCASE + Alignment: Left ``` This tells you that the field is labeled 'A', outputs starting at column 10, has a max length of 20 characters, strips whitespace, converts to uppercase, and is left-aligned. @@ -252,13 +286,13 @@ Then you can inspect individual items: ``` (gdb) dump_item pItemGroup.m_items[0] -Item @ 0x... 
- m_originalIndex: 0 - Debug: {Source=Range[1:10];Dest=@10L20} - readsLines: true - producesOutput: true - forcesRunoutCycle: false - isBreak: false +Item @ 0x5fb410 + Original Index: 1 + readsLines: False + producesOutput: False + forcesRunoutCycle: False + isBreak: False +----- end of 'Item' dump ``` ### Example 4: Examining ALU Expressions @@ -267,15 +301,16 @@ When debugging expression evaluation: ``` (gdb) dump_alu_value myALUValue -ALUValue @ 0x... - m_type: Int - m_value: "42" - m_exact: true +ALUValue @ 0x5fdc90 + Type: Int + Value: "42" + Exact: True (gdb) dump_alu_counters g_counters -ALUCounters @ 0x... - Counters (map): - m_map @ 0x... +ALUCounters @ 0x5e2c60 + Counters (2 entries): + #1: (int) 117 (exact) + #2: (float) -0.019522002761880094 ``` ### Example 5: Conditional Breakpoints with Cycle Counter @@ -283,12 +318,12 @@ ALUCounters @ 0x... To break only on a specific record number: ``` -(gdb) break Item::apply if pState.m_CycleCounter == 100 +(gdb) break DataField::apply if pState.m_CycleCounter == 100 Breakpoint 1 at 0x... (gdb) run < input.txt ... -Breakpoint 1, Item::apply (this=0x..., pState=0x..., pSB=0x...) at specitems/specItems.cc:... +Breakpoint 1, DataField::apply (this=0x..., pState=0x..., pSB=0x...) at specitems/specItems.cc:... (gdb) dump_pstate pState ProcessingState @ 0x... @@ -300,15 +335,18 @@ This is useful for debugging issues that only occur on specific records. ### Example 6: Debugging Python Function Integration -When debugging Python function calls and integration: +When debugging Python function calls and integration, there is one macro that sets most of the important python-related breakpoints: ``` -(gdb) break PythonIntf.cc:167 -Breakpoint 1 at 0x... +(gdb) bp_pyfuncs +Breakpoint 2 at 0x50e8ce: file utils/PythonIntf.cc, line 351. +Breakpoint 3 at 0x51118c: file utils/PythonIntf.cc, line 629. +Breakpoint 4 at 0x50ce71: file utils/PythonIntf.cc, line 112. +Breakpoint 5 at 0x50d3c4: file utils/PythonIntf.cc, line 170. 
-(gdb) run -f myspec.txt < input.txt -... -Breakpoint 1, PyObject_CallObject (...) at PythonIntf.cc:167 +(gdb) run -f myspec < input.txt + +Breakpoint 2, PythonFunctionCollection::Initialize (this=0x5e6da0 , _path=0x5e5fd0 "/home/sio/specs") at utils/PythonIntf.cc:351 (gdb) dump_python_func_collection gFunctionCollection PythonFunctionCollection @ 0x... diff --git a/specs/src/gdb/COMMANDS.md b/specs/src/gdb/COMMANDS.md index bb183fa..d4276f3 100644 --- a/specs/src/gdb/COMMANDS.md +++ b/specs/src/gdb/COMMANDS.md @@ -84,13 +84,11 @@ | Command | Description | |---------|-------------| -| `bp_apply` | Set breakpoints on all 9 Item subclass apply methods | -| `bp_getstr` | Set breakpoint on InputPart::getStr | -| `bp_compile` | Set breakpoint on itemGroup::Compile | -| `bp_parseAluExpression` | Set breakpoint on parseAluExpression, where expressions are parsed | -| `bp_pfc_initialize` | Set breakpoint on PythonFunctionCollection::Initialize, where the Python Function Collection is initialized | -| `bp_func_setargvalue` | Set breakpoint on PythonFuncRec::setArgValue, where an argument for an external function is set | -| `bp_func_call` | Set breakpoint on PythonFuncRec::Call, where an external function is invoked | +| `bp_apply` | Set breakpoints on all 9 `::apply` methods of `Item` subclasses | +| `bp_getstr` | Set breakpoint on `InputPart::getStr` | +| `bp_compile` | Set breakpoint on `itemGroup::Compile` | +| `bp_parseAluExpression` | Set breakpoint on `parseAluExpression`, where expressions are parsed | +| `bp_pyfuncs` | Set breakpoints on all Python function-related methods. 
This includes `PythonFunctionCollection::Initialize`, where the Python Function Collection is initialized, `PythonFunctionCollection::GetFunctionByName` where the function record is retrieved based on name, `PythonFuncRec::setArgValue`, where an argument for an external function is set, and `PythonFuncRec::Call`, where an external function is invoked | ## Usage Examples @@ -121,10 +119,10 @@ Item @ 0x... ### Dump ALUValue ```gdb (gdb) dump_alu_value myValue -ALUValue @ 0x... - m_type: Int - m_value: "42" - m_exact: true +ALUValue @ 0x5fdc30 + Type: Int + Value: "17" + Exact: True ``` ### Set Breakpoints on Item::apply diff --git a/specs/src/gdb/specs.gdb b/specs/src/gdb/specs.gdb index 954ba38..b124feb 100644 --- a/specs/src/gdb/specs.gdb +++ b/specs/src/gdb/specs.gdb @@ -237,25 +237,18 @@ document bp_context_apply Set a breakpoint on ContextItem::apply to debug rolling context operations. end -define bp_pfc_initialize +define bp_pyfuncs break PythonFunctionCollection::Initialize -end -document bp_pfc_initialize - Set a breakpoint on PythonFunctionCollection::Initialize to debug the initialization of the Python Function Collection. -end - -define bp_func_setargvalue + break PythonFunctionCollection::GetFunctionByName break PythonFuncRec::setArgValue -end -document bp_func_setargvalue - Set a breakpoint on PythonFuncRec::setArgValue to debug setting external function arguments. -end - -define bp_func_call break PythonFuncRec::Call end -document bp_func_call - Set a breakpoint on PythonFuncRec::Call to debug calling external functions. +document bp_pyfuncs + Set breakpoints on all Python function-related methods. 
This includes: + - PythonFunctionCollection::Initialize, where the Python Function Collection is initialized + - PythonFunctionCollection::GetFunctionByName where the function record is retrieved based on name + - PythonFuncRec::setArgValue, where an argument for an external function is set + - PythonFuncRec::Call, where an external function is invoked end # ============================================================================ @@ -295,13 +288,15 @@ echo dump_python_func_arg - Dump PythonFuncArg\n echo dump_exception - Dump SpecsException\n echo \n echo Breakpoint helpers:\n -echo bp_apply - Break on all 9 Item subclass apply methods\n -echo bp_getstr - Break on InputPart::getStr\n -echo bp_compile - Break on itemGroup::Compile\n -echo bp_parseAluExpression - Break on parseAluExpression, where expressions are parsed\n -echo bp_pfc_initialize - Break on PythonFunctionCollection::Initialize, where the Python Function Collection is initialized\n -echo bp_func_setargvalue - Break on PythonFuncRec::setArgValue, where an argument for an external function is set\n -echo bp_func_call - Break on PythonFuncRec::Call, where an external function is invoked\n +echo bp_apply - Break on all 9 Item subclass apply methods\n +echo bp_getstr - Break on InputPart::getStr\n +echo bp_compile - Break on itemGroup::Compile\n +echo bp_parseAluExpression - Break on parseAluExpression, where expressions are parsed\n +echo bp_pyfuncs - Break on the important functions related to Python functions:\n +echo . - PythonFunctionCollection::Initialize, where the Python Function Collection is initialized\n +echo . - PythonFunctionCollection::GetFunctionByName where the function record is retrieved based on name\n +echo . - PythonFuncRec::setArgValue, where an argument for an external function is set\n +echo .
- PythonFuncRec::Call, where an external function is invoked\n echo \n echo For more help, type: help dump-processing-state\n echo \n diff --git a/specs/src/gdb/specs_gdb.py b/specs/src/gdb/specs_gdb.py index 1da5c5d..708c0ba 100644 --- a/specs/src/gdb/specs_gdb.py +++ b/specs/src/gdb/specs_gdb.py @@ -827,6 +827,7 @@ def invoke(self, arg, from_tty): self.dump_item.invoke(arg, from_tty) # Then print derived class fields + print(f"DataField:") val = gdb.parse_and_eval(arg) label = chr(int(val["m_label"])) if int(val["m_label"]) > 0 else "none" out_start = int(val["m_outStart"]) @@ -862,6 +863,7 @@ def invoke(self, arg, from_tty): self.dump_item.invoke(arg, from_tty) # Then print derived class fields + print(f"TokenItem:") val = gdb.parse_and_eval(arg) token = deref_shared_ptr(val["mp_Token"]) if token: @@ -888,6 +890,7 @@ def invoke(self, arg, from_tty): self.dump_item.invoke(arg, from_tty) # Then print derived class fields + print(f"SetItem:") val = gdb.parse_and_eval(arg) raw_expr = std_string_to_str(val["m_rawExpression"]) key = int(val["m_key"]) @@ -911,6 +914,7 @@ def invoke(self, arg, from_tty): self.dump_item.invoke(arg, from_tty) # Then print derived class fields + print(f"SkipItem:") val = gdb.parse_and_eval(arg) raw_expr = std_string_to_str(val["m_rawExpression"]) is_until = bool(val["m_bIsUntil"]) @@ -937,6 +941,7 @@ def invoke(self, arg, from_tty): self.dump_item.invoke(arg, from_tty) # Then print derived class fields + print(f"ConditionItem:") val = gdb.parse_and_eval(arg) pred = int(val["m_pred"]) pred_str = CONDITION_PREDICATE.get(pred, f"Unknown({pred})") @@ -963,6 +968,7 @@ def invoke(self, arg, from_tty): self.dump_item.invoke(arg, from_tty) # Then print derived class fields + print(f"BreakItem:") val = gdb.parse_and_eval(arg) ident = chr(int(val["m_identifier"])) print(f" Identifier: {ident}") @@ -984,6 +990,7 @@ def invoke(self, arg, from_tty): self.dump_item.invoke(arg, from_tty) # Then print derived class fields + print(f"ContextItem:") val = 
gdb.parse_and_eval(arg) offset = int(val["m_offset"]) print(f" Offset: {offset}") @@ -1005,6 +1012,7 @@ def invoke(self, arg, from_tty): self.dump_item.invoke(arg, from_tty) # Then print derived class fields + print(f"SelectItem:") val = gdb.parse_and_eval(arg) stream = int(val["m_stream"]) b_output = bool(val["bOutput"]) @@ -1028,6 +1036,7 @@ def invoke(self, arg, from_tty): self.dump_item.invoke(arg, from_tty) # Then print derived class fields + print(f"SplitItem:") val = gdb.parse_and_eval(arg) is_field = bool(val["m_isField"]) sep = std_string_to_str(val["m_separator"]) @@ -1521,11 +1530,28 @@ def __init__(self): def invoke(self, arg, from_tty): try: val = gdb.parse_and_eval(arg) - # m_map is a std::map + m_map = val["m_map"] + size = std_map_size(m_map) print(f"ALUCounters @ {val.address}") - print(f" Counters (map):") - # Simplified: just show the address - print(f" Map @ {val['m_map'].address}") + print(f" Counters ({size} entries):") + if size == 0: + print(f" (empty)") + else: + items = std_map_items(m_map) + for key, value in items: + try: + k = int(key) + type_val = int(value["m_type"]) + type_str = ALU_COUNTER_TYPE.get(type_val, f"Unknown({type_val})").lower() + val_str = std_string_to_str(value["m_value"]) + exact = bool(value["m_exact"]) + exact_str = " (exact)" if exact else "" + if type_val == 0: # counterType__None + print(f" #{k}: (none)") + else: + print(f" #{k}: ({type_str}) {val_str}{exact_str}") + except Exception as item_e: + print(f" (error: {item_e})") except Exception as e: print(f"Error: {e}") From 099af1052c2ee7065f1322e296bbc63d71014676 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Wed, 27 May 2026 16:37:33 +0300 Subject: [PATCH 24/50] Better access to rolling context: @! 
and cfrecord --- manpage | 26 +++++++ specs/docs/alu.md | 10 ++- specs/docs/alu_adv.md | 3 +- specs/docs/streams.md | 99 +++++++++++++++++++++++++- specs/src/gdb/specs_gdb.py | 62 ++++++++++++---- specs/src/processing/ProcessingState.h | 1 + specs/src/specitems/specItems.cc | 2 +- specs/src/test/ProcessingTest.cc | 36 ++++++++++ specs/src/utils/alu.cc | 19 ++++- specs/src/utils/aluFunctions.cc | 12 +++- specs/src/utils/aluFunctions.h | 5 +- specs/tests/valgrind_unit_tests.py | 2 +- 12 files changed, 255 insertions(+), 22 deletions(-) diff --git a/manpage b/manpage index 65be818..e4e339e 100644 --- a/manpage +++ b/manpage @@ -736,6 +736,23 @@ and .B @-0 are equivalent to .B @@. +The +.B @! +syntax substitutes for the current record as affected by +.B CONTEXT. +Without +.B CONTEXT +in effect, +.B @! +is equivalent to +.B @@. +When +.B CONTEXT +is active, +.B @@ +still returns the original input record, while +.B @! +returns the context-affected record. Out-of-bounds offsets return the empty string. The forward and backward buffer sizes are computed automatically at parse time. This feature requires non-threaded mode and a single input stream. .SS "Built-In Functions" @@ -924,6 +941,15 @@ is active, returns the same value as .IP "record()" 3 Returns the entire input record. Equivalent to .B range(1,-1) +.IP "cfrecord()" 3 +Returns the entire input record, disregarding rolling context. When +.I CONTEXT +is active, +.B record() +returns the context-affected record, while +.B cfrecord() +returns the original input record. Equivalent to +.B @@. 
.IP "range(s,e)" 3 Returns the range of characters from .B s diff --git a/specs/docs/alu.md b/specs/docs/alu.md index fc81e9c..f88d765 100644 --- a/specs/docs/alu.md +++ b/specs/docs/alu.md @@ -117,7 +117,14 @@ POSIX (darwin) system using the g++ compiler and Python 3.9.6 - release variatio ``` Others are `@cols`, which contains the number of columns in the terminal screen, and `@rows`, which contains the number of rows on that same screen. -Additionally, the `@@` string stands for the entire input record. +Additionally, the `@@` string stands for the entire input record. When rolling context is in effect (see [Streams and Records](streams.md#rolling-context)), `@@` always refers to the original input record. The `@!` string refers to the current record as affected by `CONTEXT`, which is the same as `@@` when no `CONTEXT` is active. The `@-n` and `@+n` syntax is an alternative to using `CONTEXT` that is effective within expressions. The following three specifications are equivalent: + +``` +# Using @@ syntax # Using the CONTEXT keyword # No expression - just data fields +PRINT @@ 1 WRITE PRINT @! 1 WRITE 1-* 1 WRITE +PRINT @+1 1 WRITE CONTEXT +1 PRINT @! 1 WRITE CONTEXT +1 1-* 1 WRITE +PRINT @+2 1 WRITE CONTEXT +2 PRINT @! 1 WRITE CONTEXT +2 1-* 1 WRITE +``` `@python` contains either "Enabled" or "Disabled" depending on whether python function support is enabled. @@ -143,6 +150,7 @@ A full list of supported operators can be found in [Advanced ALU Topics](alu_adv The specs ALU has a bunch of built-in functions. The full list is available at [Advanced ALU Topics](alu_adv.md), but here are a few examples: * len(x) - returns the length of x considered as a string * record() - returns the entire input record +* cfrecord() - returns the entire input record, disregarding rolling context * words(start, count) - returns a substring of the input record, similar to what `words start.count` would yield in a data field.
* tf2mcs(s,f) and mcs2tf(x,f) - convert a formatted date string to the internal representation, which is measured in microseconds since the Unix epoch (1-Jan-1970 at midnight), and convert the other way. The format is similar to that of the C function strftime(), plus %xf for fractional seconds, where x represents number of digits from 0 to 6. * pos(needle,haystack) diff --git a/specs/docs/alu_adv.md b/specs/docs/alu_adv.md index e81562e..c7b03aa 100644 --- a/specs/docs/alu_adv.md +++ b/specs/docs/alu_adv.md @@ -165,7 +165,8 @@ All three regular expression functions have an argument called `matchFlags`. Thi | `range(n,m)` | Returns the substring from the *n*-th character (default first) to the *m*-th character (default last) | | `recno()` | Returns the number of the currently read record. If the `READ` or `READSTOP` keywords are used this may be greater than `number()` | | `ctxrecno()` | Returns the record number of the record that input parts work on. This is similar to `recno()`, but considers rolling context, which `recno()` does not. | -| `record()` | Returns the entire input record | +| `record()` | Returns the entire input record. Equivalent to `@!`. | +| `cfrecord()` | Returns the entire input record, disregarding rolling context. Same as `record()` when `CONTEXT` is not in effect. Equivalent to `@@`. | | `word(n)` | Returns the *n*-th word | | `wordrange(n,m)` | Returns the substring from the *n*-th word (default first) to the *m*-th word (default last) | | `wordcount(s,p)` | Returns the number of words in the string `s`, or in the current record if `s` is not specified. The separator used is `p`. If `p` is not specified, the separator is the current word separator if processing the current record, or a **blank space** if processing `s`. 
| diff --git a/specs/docs/streams.md b/specs/docs/streams.md index cdc422d..f4ae33e 100644 --- a/specs/docs/streams.md +++ b/specs/docs/streams.md @@ -94,7 +94,7 @@ e6d7f9ac591379d653a5685f9d75deccc1792545 synp71 1548011387.000000 ## Pushing Back The Last Record That specification in the previous section reads several lines in a `WHILE` loop searching for the line we need for the next iteration. This is a common pattern and we were forced to use a variable to transfer the content of the next commit record to the next iteration. -**specs** version 0.3 introduces the `UNREAD` spec unit. What it does is push back the current read record so that it is possible to process it as the first record of the next iteration. The specification above can thus be simplified as follows: +The `UNREAD` spec unit pushes back the current read record so that it is possible to process it as the first record of the next iteration. The specification above can thus be simplified as follows: ``` specs WORD 2 1 @@ -247,6 +247,103 @@ A few things to note: 1. `READ` and `READSTOP` **MUST NOT** be used during secondary reading. This will result in an error. 1. Specifications should not mix `READ` and `READSTOP` with `SELECT SECOND` even if the `READ` or `READSTOP` is during reading of the primary record. The results are undefined and may change in future releases. +## Rolling Context +The `SELECT SECOND` mechanism described above lets us peek one record ahead. But what if we need to look further ahead, or look *behind* at records we've already seen? The `CONTEXT` spec unit provides a general way to do this. + +`CONTEXT` takes a single integer argument -- a positive number to look forward, a negative number to look backward, or zero to reset to the current record. When **specs** encounters a `CONTEXT` spec unit, it changes the active input record to the one at the given offset from the current record. Any input parts that follow will read from that record instead of the current one. 
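The offset rule described above can be modelled in a few lines of Python. This is only an illustrative sketch, not how **specs** implements it: `context_record` and `run_cycles` are hypothetical names, and the sketch assumes that an offset pointing outside the input yields an empty record and that an empty record adds nothing to the output line.

```python
def context_record(records, current, offset):
    """Return the record at `offset` from index `current`, or "" if none exists."""
    idx = current + offset
    if 0 <= idx < len(records):
        return records[idx]
    return ""


def run_cycles(records, offset):
    """Rough model of `1-* 1 CONTEXT <offset> 1-* NEXTWORD`.

    Each cycle emits the current record followed by the context record.
    """
    lines = []
    for i, rec in enumerate(records):
        ctx = context_record(records, i, offset)
        # Assumption: an empty context record contributes nothing to the line.
        lines.append(" ".join(part for part in (rec, ctx) if part))
    return lines


if __name__ == "__main__":
    for line in run_cycles(["alpha", "beta", "gamma"], 1):
        print(line)
```

Passing a negative offset to the same hypothetical helper models looking backward, where the first cycle has no previous record to show.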
+ +Consider the following input: +``` +alpha +beta +gamma +``` +And use the following specification: +``` +specs 1-* 1 CONTEXT 1 1-* NEXTWORD +``` +The output is: +``` +alpha beta +beta gamma +gamma +``` +On the first cycle, the current record is `alpha` and `CONTEXT 1` peeks one record ahead to `beta`. On the second cycle, the current record is `beta` and `CONTEXT 1` peeks ahead to `gamma`. On the third cycle, there is no record after `gamma`, so the context record is empty. + +Looking backward works the same way: +``` +specs 1-* 1 CONTEXT -1 1-* NEXTWORD +``` +produces: +``` +alpha +beta alpha +gamma beta +``` +On the first cycle there is no previous record, so the context record is empty. On later cycles we get the previous record. + +Multiple `CONTEXT` tokens can appear in a single specification, and `CONTEXT 0` resets to the current record: +``` +specs CONTEXT 1 WORD 1 1 CONTEXT 0 WORD 1 NEXTWORD +``` +Given the same input, the output is: +``` +beta alpha +gamma beta +gamma +``` +The first column comes from `WORD 1` while the *next* record is selected, and the second column comes from `WORD 1` after `CONTEXT 0` resets back to the current record. + +### Context in Expressions +In addition to the `CONTEXT` spec unit, **specs** supports the `@+n` and `@-n` syntax in expressions, where *n* is a non-negative integer. These evaluate to the full content of the record at the given offset: +``` +specs PRINT "length(@+1)" 1 +``` +Given the input `AB`, `CDE`, `F`, this outputs `3`, `1`, `0` -- the length of the *next* record in each cycle. + +Note that `@@` (the current input record) and `@+0` or `@-0` are not quite the same thing when `CONTEXT` is also used: `@@` always returns the real input record, regardless of any `CONTEXT` that may be in effect. To get the context-affected record in an expression, use `@!`: +``` +specs CONTEXT 1 PRINT "@!" 
1 WRITE PRINT "@@" 1 WRITE +``` +Given the input `alpha`, `beta`, `gamma`, the output is: +``` +beta +alpha +gamma +beta + +gamma +``` +The first line of each pair comes from `@!` (the context-affected record -- one ahead), while the second comes from `@@` (the original input record). Without `CONTEXT`, `@!` and `@@` are equivalent. + +Similarly, the `record()` function returns the context-affected record, while the `cfrecord()` function always returns the original input record regardless of any `CONTEXT` that may be in effect. + +### The ctxrecno() Function +The `ctxrecno()` function returns the record number that the context record *would* have if it were the current record. Without any `CONTEXT` in effect, `ctxrecno()` is the same as `recno()`. With `CONTEXT 1`, `ctxrecno()` returns `recno() + 1`, and so on: +``` +specs PRINT "ctxrecno()" 1 CONTEXT 1 PRINT "ctxrecno()" NEXTWORD +``` +Given three input records, the output is: +``` +1 2 +2 3 +3 4 +``` + +### How It Works +**specs** determines the maximum forward and backward offsets at compile time and uses them to maintain a sliding window of records around the current one. Records are read ahead into a forward buffer, and past records are kept in a backward buffer. This means that a specification using `CONTEXT 3` will read three records ahead before processing begins. + +When verbose mode (`-v`) is enabled, **specs** reports the buffer sizes: +``` +specs: Using a 3-record rolling context: 2 records forward and 1 records backward. +``` + +If the context offset refers to a record that does not exist (before the first record or past the last), the context record is empty. + +### Restrictions +1. Rolling context is not supported with threading (`-j` flag). +1. Rolling context is not supported with multiple input streams. + ## Multiple Input Streams **specs** allows you to use multiple input streams in your specifications. 
The way this works is that you use the `--is2` to `--is8` CLI switches to specify additional (up to a total of 8) input streams to use. At each cycle of the specification, 1 record is read from each input stream, which implies that the number of records in each stream should be equal. diff --git a/specs/src/gdb/specs_gdb.py b/specs/src/gdb/specs_gdb.py index 708c0ba..20f7b81 100644 --- a/specs/src/gdb/specs_gdb.py +++ b/specs/src/gdb/specs_gdb.py @@ -1054,6 +1054,39 @@ def invoke(self, arg, from_tty): # DUMP COMMANDS - itemGroup # ============================================================================ +def _dump_item_detail(ptr_val, index, indent=" "): + """ + Given an Item* (as a gdb.Value integer pointer), print a detailed + one-line summary including the dynamic type name and Debug() output. + """ + if int(ptr_val) == 0: + print(f"{indent}[{index}] ") + return + + # Determine the dynamic type name via RTTI + type_name = "Item" + try: + item_obj = ptr_val.dereference() + dyn_type = item_obj.dynamic_type + type_name = dyn_type.name + except: + pass + + # Call the virtual Debug() method for a type-specific description + debug_str = None + try: + result = gdb.parse_and_eval( + f'((Item*)({int(ptr_val)}))->Debug()') + debug_str = std_string_to_str(result) + except: + pass + + if debug_str: + print(f"{indent}[{index}] ({type_name}) {debug_str}") + else: + print(f"{indent}[{index}] ({type_name}) @ {ptr_val}") + + class DumpItemGroup(gdb.Command): """Dump an itemGroup.""" @@ -1064,6 +1097,7 @@ def invoke(self, arg, from_tty): try: val = gdb.parse_and_eval(arg) need_runout = bool(val["bNeedRunoutCycle"]) + need_runout_from_start = bool(val["bNeedRunoutCycleFromStart"]) found_second = bool(val["bFoundSelectSecond"]) # Get m_items vector @@ -1072,20 +1106,22 @@ def invoke(self, arg, from_tty): print(f"itemGroup @ {val.address}") print(f" Need Runout Cycle: {need_runout}") + print(f" From start: {need_runout_from_start}") print(f" Found Select Second: {found_second}") 
print(f" Item count: {item_count}") print(f" Items:") - # Try to iterate items (simplified) - for i in range(min(item_count, 10)): # Limit to first 10 + max_display = 50 + for i in range(min(item_count, max_display)): try: - item = items_vec["_M_impl"]["_M_start"][i] - print(f" [{i}] @ {item.address}") - except: - pass - - if item_count > 10: - print(f" ... and {item_count - 10} more items") + shared_ptr = items_vec["_M_impl"]["_M_start"][i] + ptr = shared_ptr["_M_ptr"] + _dump_item_detail(ptr, i) + except Exception: + print(f" [{i}] ") + + if item_count > max_display: + print(f" ... and {item_count - max_display} more items") except Exception as e: print(f"Error: {e}") @@ -1168,9 +1204,11 @@ def invoke(self, arg, from_tty): print(f"ProcessingState @ {val.address}") # --- Records --- - print(f" Current Record: {self._fmt_record(val['m_ps'])}") + print(f" Current Record: {self._fmt_record(val['m_ps'])} (CONTEXT-dependent)") print(f" Previous Record: {self._fmt_record(val['m_prevPs'])}") - print(f" Input Record: {self._fmt_record(val['m_inputRecord'])}") + print(f" Input Record: {self._fmt_record(val['m_inputRecord'])} (CONTEXT-independent)") + context_offset = int(val["m_contextOffset"]) + print(f" Context Offset: {context_offset}") # --- Separators & Padding --- try: @@ -1197,11 +1235,9 @@ def invoke(self, arg, from_tty): try: cycle = int(val["m_CycleCounter"]) extra_reads = int(val["m_ExtraReads"]) - context_offset = int(val["m_contextOffset"]) print(f" Cycle Counter: {cycle}") print(f" Extra Reads: {extra_reads}") print(f" Record Count: {cycle + extra_reads}") - print(f" Context Offset: {context_offset}") except Exception as e: print(f" Counters: (error: {e})") diff --git a/specs/src/processing/ProcessingState.h b/specs/src/processing/ProcessingState.h index b25b48b..1d84061 100644 --- a/specs/src/processing/ProcessingState.h +++ b/specs/src/processing/ProcessingState.h @@ -51,6 +51,7 @@ class ProcessingState : public stateQueryAgent { PSpecString getFromTo(int 
from, int to) override; bool isRunIn() override { return (m_CycleCounter==1); } bool isRunOut() override { return (m_ps==nullptr); } // NOTE: will return true before first record + bool isEOF() override { return m_bEOF; } ALUInt getRecordCount() override { return ALUInt(m_CycleCounter + m_ExtraReads); } ALUInt getContextOffset() override { return ALUInt(m_contextOffset); } ALUInt getIterationCount() override { return ALUInt(m_CycleCounter); } diff --git a/specs/src/specitems/specItems.cc b/specs/src/specitems/specItems.cc index ae74bd8..f708536 100644 --- a/specs/src/specitems/specItems.cc +++ b/specs/src/specitems/specItems.cc @@ -927,7 +927,7 @@ ApplyRet TokenItem::apply(ProcessingState& pState, StringBuilder* pSB) case TokenListType__WRITE: return ApplyRet__Write; case TokenListType__EOF: - return ApplyRet__EOF; + return pState.isEOF() ? ApplyRet__Continue : ApplyRet__EOF; case TokenListType__UNREAD: return ApplyRet__UNREAD; case TokenListType__REDO: diff --git a/specs/src/test/ProcessingTest.cc b/specs/src/test/ProcessingTest.cc index cde37e8..ef02bdf 100644 --- a/specs/src/test/ProcessingTest.cc +++ b/specs/src/test/ProcessingTest.cc @@ -191,6 +191,7 @@ PSpecString runTestOnExample(const char* _specList, const char* _example) if (!ig.readsLines()) { ig.setRegularRunAtEOF(); } + ps.setEOF(); ps.setString(nullptr); ps.setFirst(); try { @@ -954,6 +955,41 @@ int main(int argc, char** argv) spec = "PRINT 'ctxrecno()' 1 CONTEXT 1 PRINT 'ctxrecno()' NW"; VERIFY2(spec, "a\nb\nc", "1 2\n2 3\n3 4"); // TEST #247 + // EOF token should not terminate processing during runout cycle + // when bNeedRunoutCycleFromStart is set by eof() in a condition + spec = "w1 a: EOF if /eof()/ then /hello/ 1 endif"; + VERIFY2(spec, "test", "hello"); // TEST #248 + + // Same with visible pre-EOF output + spec = "a: w1 1 EOF if /eof()/ then /done/ 1 endif"; + VERIFY2(spec, "x\ny", "x\ny\ndone"); // TEST #249 + + // CONTEXT + EOF + eof(): CONTEXT changes m_ps during runout, + // but eof() 
should still return true and EOF token should not stop processing + spec = "w1 1 CONTEXT 1 if /!eof()/ then 1-* nw endif EOF if /eof()/ then /RUNOUT/ 1 endif"; + VERIFY2(spec, "a\nb\nc", "a b\nb c\nc\nRUNOUT"); // TEST #250 + + // @! returns the context-affected record (same as record() or 1-*) + // Without CONTEXT, @! and @@ are equivalent + spec = "PRINT '@!' 1"; + VERIFY2(spec, "alpha\nbeta\ngamma", "alpha\nbeta\ngamma"); // TEST #251 + + // With CONTEXT, @! returns the context-affected record while @@ returns the original + spec = "CONTEXT 1 PRINT '@!' 1 WRITE PRINT '@@' 1 WRITE"; + VERIFY2(spec, "alpha\nbeta\ngamma", "beta\nalpha\ngamma\nbeta\n\ngamma"); // TEST #252 + + // @! with CONTEXT -1 returns the previous record + spec = "CONTEXT -1 PRINT '@!' 1"; + VERIFY2(spec, "alpha\nbeta\ngamma", "\nalpha\nbeta"); // TEST #253 + + // cfrecord() without CONTEXT returns the same as record() + spec = "PRINT 'cfrecord()' 1"; + VERIFY2(spec, "alpha\nbeta\ngamma", "alpha\nbeta\ngamma"); // TEST #254 + + // cfrecord() with CONTEXT returns the original input record (not context-affected) + spec = "CONTEXT 1 PRINT 'cfrecord()' 1 PRINT 'record()' NW"; + VERIFY2(spec, "alpha\nbeta\ngamma", "alpha beta\nbeta gamma\ngamma"); // TEST #255 + if (errorCount) { std::cout << '\n' << errorCount << '/' << testCount << " tests failed.\n"; std::cout << "Failed tests: "; diff --git a/specs/src/utils/alu.cc b/specs/src/utils/alu.cc index 55b5596..50683a5 100644 --- a/specs/src/utils/alu.cc +++ b/specs/src/utils/alu.cc @@ -864,7 +864,9 @@ PValue AluAssnOperator::computeAppnd(PValue operand, PValue prevOp) void AluInputRecord::_serialize(std::ostream& os) const { - if (m_offset == 0) { + if (m_offset == INT_MAX) { + os << "@!"; + } else if (m_offset == 0) { os << "@@"; } else if (m_offset > 0) { os << "@+" << m_offset; @@ -875,6 +877,7 @@ void AluInputRecord::_serialize(std::ostream& os) const std::string AluInputRecord::_identify() { + if (m_offset == INT_MAX) return "@!"; if (m_offset == 
0) return "@@"; if (m_offset > 0) return "@+" + std::to_string(m_offset); return "@" + std::to_string(m_offset); @@ -883,7 +886,9 @@ std::string AluInputRecord::_identify() PValue AluInputRecord::evaluate() { PSpecString ps; - if (m_offset == 0) { + if (m_offset == INT_MAX) { + ps = g_pStateQueryAgent->currRecord(); + } else if (m_offset == 0) { ps = g_pStateQueryAgent->inputRecord(); } else { MYASSERT_WITH_MSG(g_pReader != nullptr, "Rolling context requires a reader"); @@ -1306,6 +1311,16 @@ bool parseAluExpression(std::string& s, AluVec& vec) continue; } + // A special string @! representing the current record (context-affected) + if (*c=='@' && c[1]=='!') { + c+=2; + pUnit = std::make_shared(INT_MAX); + vec.push_back(pUnit); + prevUnitType = pUnit->type(); + mayBeStart = false; + continue; + } + // hash-sign followed by a number is either a counter or a persistent variable if (*c=='#') { c++; diff --git a/specs/src/utils/aluFunctions.cc b/specs/src/utils/aluFunctions.cc index 59c5fc2..e900d23 100644 --- a/specs/src/utils/aluFunctions.cc +++ b/specs/src/utils/aluFunctions.cc @@ -421,7 +421,7 @@ PValue AluFunc_ctxrecno() PValue AluFunc_eof() { - bool isRunOut = g_pStateQueryAgent->isRunOut(); + bool isRunOut = g_pStateQueryAgent->isEOF(); return mkValue(ALUInt(isRunOut ? 
1 : 0)); } @@ -500,6 +500,16 @@ PValue AluFunc_record() return AluFunc_range(1,-1); } +PValue AluFunc_cfrecord() +{ + PSpecString ps = g_pStateQueryAgent->inputRecord(); + if (ps) { + return mkValue(ps->data()); + } else { + return mkValue(""); + } +} + PValue AluFunc_range(PValue pStart, PValue pEnd) { ALUInt start = ARG_INT_WITH_DEFAULT(pStart,1); diff --git a/specs/src/utils/aluFunctions.h b/specs/src/utils/aluFunctions.h index 4021fc2..f427f28 100644 --- a/specs/src/utils/aluFunctions.h +++ b/specs/src/utils/aluFunctions.h @@ -50,7 +50,9 @@ "(fid) - Returns TRUE (1) if the break for field-identifier 'fid' is established, or FALSE (0) otherwise.","") \ H(Record Functions,16) \ X(record, 0, ALUFUNC_REGULAR, true, \ - "() - Returns the entire record.","Equivalent to the @@ pseudo-variable.") \ + "() - Returns the entire record.","Equivalent to the @@ pseudo-variable when CONTEXT is not in effect.") \ + X(cfrecord, 0, ALUFUNC_REGULAR, true, \ + "() - Returns the entire input record, disregarding rolling context.","Equivalent to the @@ pseudo-variable. 
Same as record() when CONTEXT is not in effect.") \ X(length, 1, ALUFUNC_REGULAR, false, \ "(s) - Returns the length of the string s","") \ X(wordcount, 2, ALUFUNC_REGULAR, false, \ @@ -432,6 +434,7 @@ class stateQueryAgent { virtual PSpecString inputRecord() = 0; virtual bool isRunIn() = 0; virtual bool isRunOut() = 0; + virtual bool isEOF() = 0; virtual ALUInt getRecordCount() = 0; virtual ALUInt getContextOffset() = 0; virtual ALUInt getIterationCount() = 0; diff --git a/specs/tests/valgrind_unit_tests.py b/specs/tests/valgrind_unit_tests.py index c5329e2..37239af 100644 --- a/specs/tests/valgrind_unit_tests.py +++ b/specs/tests/valgrind_unit_tests.py @@ -1,7 +1,7 @@ import sys, memcheck, argparse count_ALU_tests = 832 -count_processing_tests = 247 +count_processing_tests = 255 count_token_tests = 17 # Parse the one command line options From bd4e03e9599cf6a7b991b062243213f5c142c482 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Thu, 28 May 2026 14:02:11 +0300 Subject: [PATCH 25/50] Issue #396 - harmonize record retrieval in run-out cycle (#398) - Also, explained forced runout in the docs - Also, aligned docs on "run-out" instead of "runout". --- manpage | 31 ++++++++++++----- specs/docs/onepage.md | 9 +++-- specs/docs/struct.md | 17 +++++++-- specs/src/gdb/specs_gdb.py | 46 ++++++++++++++++++++++++- specs/src/processing/ProcessingState.cc | 6 ++-- specs/src/test/ProcessingTest.cc | 13 +++++++ specs/tests/valgrind_unit_tests.py | 2 +- 7 files changed, 106 insertions(+), 18 deletions(-) diff --git a/manpage b/manpage index e4e339e..491b163 100644 --- a/manpage +++ b/manpage @@ -574,7 +574,7 @@ sorted by department name. You can print this out without repeating the departme BREAK c ID c 1 -.SS "RunIn and RunOut Cycles" +.SS "Run-In and Run-Out Cycles" A .B cycle is a single run of the specification on the current active input record. A cycle may read additional input records, produce zero output records, or produce multiple output records. 
If the specification contains @@ -584,16 +584,16 @@ or tokens, a single cycle can consume more than one input record. The -.B runin -cycle is the first cycle. In the runin cycle, the function +.B run-in +cycle is the first cycle. In the run-in cycle, the function .B first() returns 1. This can be used for initial processing such as printing of headers or setting initial values. The -.B runout +.B run-out cycle happens .I after -the last line has been read, but only when the specification requires a runout cycle. It consists of the spec items that follow the +the last line has been read, but only when the specification requires a run-out cycle. It consists of the spec items that follow the .B EOF token, or (when .I select second @@ -612,6 +612,21 @@ function. Example: /==========/ 1 write /Total:/ 1 print #0 nw + + +Note that any use of the +.B eof() +function anywhere in the specification forces a run-out cycle. This is called a +.I forced run-out cycle +and any use of record-reading spec units, such as a +.I data field +or +.B ALU +functions that access records, like +.B record() +or +.B word(i) +will result in an empty string returned. .SS "Input Streams" The keyword @@ -913,12 +928,12 @@ Returns a binary representation of the unsigned integer in x. The field length i Returns the length of the argument when viewed as a string. For example, len(37) is 2; len('hello') is 5. .IP "first()" 3 Returns 1 during the -.B runin +.B run-in cycle, and zero otherwise. .IP "eof()" 3 Returns 1 during the -.B runout -cycle, and zero otherwise. +.B run-out +cycle, and zero otherwise. Its use in a specification forces a run-out cycle. .IP "number()" 3 Returns the number of times the specification has so far been run on different records. .IP "recno()" 3 diff --git a/specs/docs/onepage.md b/specs/docs/onepage.md index 1c27295..0ee0ae1 100644 --- a/specs/docs/onepage.md +++ b/specs/docs/onepage.md @@ -150,13 +150,13 @@ Without **while-guard** this specification will loop forever. 
To solve this, **s **While-Guard** is not perfect. To disable it, you can use the command-line switch `--no-while-guard` or you can override the maximum iteration count at which the program exist by setting the `while-guard-limit` to some integer value. -RunIn and RunOut Cycles +Run-In and Run-Out Cycles ========================= A **cycle** is defined as a single run of the specification, which includes reading an input record, processing it, and outputting one or more records. If the specification contains **read** or **readstop** tokens, a single cycle can consume more than one input records. -The **runin** cycle is the first one to run. In the runin cycle, the function **first()** returns 1. This can be used for initial processing such as printing of headers or setting initial values. +The **run-in** cycle is the first one to run. In the run-in cycle, the function **first()** returns 1. This can be used for initial processing such as printing of headers or setting initial values. -The **runout** cycle happens *after* the last line has been read. It consists of the spec items that follow the **EOF** token, or (when **select second** is used) conditional specifications with the **eof()** function. Example: +The **run-out** cycle happens *after* the last line has been read. It consists of the spec items that follow the **EOF** token, or (when **select second** is used) conditional specifications with the **eof()** function. Example: ``` if first() then /Item/ 1 /Square/ nw write @@ -170,6 +170,9 @@ The **runout** cycle happens *after* the last line has been read. It consists o /Total:/ 1 print #0 nw ``` + +Note that there are two kinds of **run-out** cycle: the **explicit run-out cycle**, where there are spec units after an **EOF** token, and the **forced run-out cycle**, which is triggered by the use of the `eof()` function anywhere in the specification. A forced run-out cycle forces an extra run of the specification with what appears to be an empty input record. 
+ Configuration File ================== diff --git a/specs/docs/struct.md b/specs/docs/struct.md index 56ca27d..e14d44e 100644 --- a/specs/docs/struct.md +++ b/specs/docs/struct.md @@ -107,7 +107,7 @@ specs if "first()" then ### Run-Out The Run-Out cycle runs *after* the last record is processed. It is only run if it has something to do. There are two ways to do things on the run-out cycles: -1. Using the boolean function `eof()`. +1. Using the boolean function `eof()`. We call this a **forced run-out cycle** 2. Using the `EOF` keyword. The following enhancement of the run-in example will demonstrate both: @@ -138,7 +138,7 @@ specs set #0+=a eof /Total:/ 1 - print #0 Next + print #0 NEXTWORD ``` | Input | Output | | ----- | ------ | @@ -148,6 +148,19 @@ specs | 4 | 4 | | | Total: 10 | +Here's an alternate implementation with a **forced run-out cycle**: +``` +# Summing +specs + IF "eof()" THEN + /Total:/ 1 + print #0 NEXTWORD + ELSE + a: WORD 1 1 + SET #0+=a + ENDIF +``` + ### Control Breaks **Field identifiers** can be used for conditional execution when their value changes from record to record. 
Consider the following example CSV file containing personnel records: ``` diff --git a/specs/src/gdb/specs_gdb.py b/specs/src/gdb/specs_gdb.py index 20f7b81..3a1071e 100644 --- a/specs/src/gdb/specs_gdb.py +++ b/specs/src/gdb/specs_gdb.py @@ -830,6 +830,7 @@ def invoke(self, arg, from_tty): print(f"DataField:") val = gdb.parse_and_eval(arg) label = chr(int(val["m_label"])) if int(val["m_label"]) > 0 else "none" + tail_label = chr(int(val["m_tailLabel"])) if int(val["m_tailLabel"]) > 0 else "none" out_start = int(val["m_outStart"]) max_len = int(val["m_maxLength"]) strip = bool(val["m_strip"]) @@ -839,14 +840,57 @@ def invoke(self, arg, from_tty): conv_str = STRING_CONVERSIONS.get(conv, f"Unknown({conv})") align_str = OUTPUT_ALIGNMENT.get(align, f"Unknown({align})") - print(f" Label: {label}") + print(f" Label: {label}/{tail_label}") print(f" Output Start: {out_start}") print(f" Max Length: {max_len}") print(f" Strip: {strip}") print(f" Conversion: {conv_str}") print(f" Alignment: {align_str}") + + # m_InputPart (shared_ptr) + try: + ip_ptr = val["m_InputPart"]["_M_ptr"] + if int(ip_ptr) == 0: + print(f" Input Part: ") + else: + result = gdb.parse_and_eval( + f'((InputPart*)({int(ip_ptr)}))->Debug()') + debug_str = std_string_to_str(result) + print(f" Input Part: {debug_str}") + except Exception as ex: + print(f" Input Part: ") + + # AluVec fields + self._dump_alu_vec_field(val, "m_outputStartExpression", "Output Start Expression") + self._dump_alu_vec_field(val, "m_outputWidthExpression", "Output Width Expression") + self._dump_alu_vec_field(val, "m_outputAlignmentExpression", "Output Alignment Expression") except Exception as e: print(f"Error: {e}") + + def _dump_alu_vec_field(self, val, field_name, label): + """Print an AluVec field on one line, or 'empty' if it has no elements.""" + try: + vec = val[field_name] + size = std_vector_size(vec) + if size == 0: + print(f" {label}: empty") + else: + start = vec["_M_impl"]["_M_start"] + items = [] + for i in 
range(size): + try: + ptr = start[i]["_M_ptr"] + if int(ptr) != 0: + result = gdb.parse_and_eval( + f'((AluUnit*)({int(ptr)}))->_identify()') + items.append(std_string_to_str(result)) + else: + items.append("") + except: + items.append("?") + print(f" {label}: {'; '.join(items)}") + except Exception as ex: + print(f" {label}: ") class DumpTokenItem(gdb.Command): """Dump a TokenItem.""" diff --git a/specs/src/processing/ProcessingState.cc b/specs/src/processing/ProcessingState.cc index 61c4e5d..02cdbee 100644 --- a/specs/src/processing/ProcessingState.cc +++ b/specs/src/processing/ProcessingState.cc @@ -357,14 +357,14 @@ int ProcessingState::getWordEnd(int idx) { return m_wordEnd[idx-1]; } -// Convention: returns NULL for an empty string // Convention: from=0 means from the start (same as 1) // Convention: to=0 means to the end // Convention: from=0 and to=0 -- empty string. PSpecString ProcessingState::getFromTo(int from, int to) { - if (m_inputStation != STATION_SECOND) { - MYASSERT_WITH_MSG(nullptr!=m_ps,"Tried to read record in run-out cycle"); + // In the run-out cycle, return an empty string + if (m_inputStation != STATION_SECOND && nullptr==m_ps) { + return std::make_shared(); } int slen = (int)(currRecord()->length()); diff --git a/specs/src/test/ProcessingTest.cc b/specs/src/test/ProcessingTest.cc index ef02bdf..8a2098c 100644 --- a/specs/src/test/ProcessingTest.cc +++ b/specs/src/test/ProcessingTest.cc @@ -990,6 +990,19 @@ int main(int argc, char** argv) spec = "CONTEXT 1 PRINT 'cfrecord()' 1 PRINT 'record()' NW"; VERIFY2(spec, "alpha\nbeta\ngamma", "alpha beta\nbeta gamma\ngamma"); // TEST #255 + // record(), word(), field(), range() return empty string during forced run-out cycle + spec = "PRINT 'record()' 1 PRINT 'eof()' NEXTWORD"; + VERIFY2(spec, "hello\nworld", "hello 0\nworld 0\n1"); // TEST #256 + + spec = "PRINT 'word(1)' 1 PRINT 'eof()' NEXTWORD"; + VERIFY2(spec, "hello world", "hello 0\n1"); // TEST #257 + + spec = "PRINT 'field(1)' 1 PRINT 
'eof()' NEXTWORD"; + VERIFY2(spec, "hello\tworld", "hello 0\n1"); // TEST #258 + + spec = "PRINT 'range(1,3)' 1 PRINT 'eof()' NEXTWORD"; + VERIFY2(spec, "abcdef", "abc 0\n1"); // TEST #259 + if (errorCount) { std::cout << '\n' << errorCount << '/' << testCount << " tests failed.\n"; std::cout << "Failed tests: "; diff --git a/specs/tests/valgrind_unit_tests.py b/specs/tests/valgrind_unit_tests.py index 37239af..8ad137d 100644 --- a/specs/tests/valgrind_unit_tests.py +++ b/specs/tests/valgrind_unit_tests.py @@ -1,7 +1,7 @@ import sys, memcheck, argparse count_ALU_tests = 832 -count_processing_tests = 255 +count_processing_tests = 259 count_token_tests = 17 # Parse the one command line options From 7f77310d672d9f419007faf5a73ef0c7acc960f4 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Thu, 28 May 2026 14:53:36 +0300 Subject: [PATCH 26/50] Issue #397 - Determine full-path vs bare name in specFile (#399) --- specs/src/cli/splitter.cc | 19 ++++++++++++++++--- specs/tests/memcheck.py | 2 +- specs/tests/recfm_tests.py | 2 +- 3 files changed, 18 insertions(+), 5 deletions(-) diff --git a/specs/src/cli/splitter.cc b/specs/src/cli/splitter.cc index 93275a8..2f11386 100644 --- a/specs/src/cli/splitter.cc +++ b/specs/src/cli/splitter.cc @@ -169,12 +169,25 @@ std::string removeComment(std::string& st) return ret; } +static bool hasPathComponent(const std::string& name) +{ + if (name.find('/') != std::string::npos) return true; +#ifdef WIN64 + if (name.find('\\') != std::string::npos) return true; + if (name.size() >= 2 && std::isalpha(name[0]) && name[1] == ':') return true; +#endif + return false; +} + static void openSpecFile(std::ifstream& theFile, std::string& fileName) { - theFile.open(fileName); - if (theFile.is_open()) return; + if (hasPathComponent(fileName)) { + // Explicit path -- open directly, don't search the spec path + theFile.open(fileName); + return; + } - // No? 
Try the path + // Bare name -- search the spec path char* spath = strdup(getFullSpecPath()); if (spath && spath[0]) { char* spath_ctx = spath; diff --git a/specs/tests/memcheck.py b/specs/tests/memcheck.py index c751dcd..aa6b682 100644 --- a/specs/tests/memcheck.py +++ b/specs/tests/memcheck.py @@ -78,7 +78,7 @@ def leak_check_specs(spec, inp, testid, confFile, inp2=None): outfile = "theout."+str(testid) conffile = "theconf."+str(testid) else: - specfile = "thespec" + specfile = "./thespec" inpfile = "theinp" inp2file = "theinp2" outfile = "theout" diff --git a/specs/tests/recfm_tests.py b/specs/tests/recfm_tests.py index 9fc0614..e3d5551 100644 --- a/specs/tests/recfm_tests.py +++ b/specs/tests/recfm_tests.py @@ -34,7 +34,7 @@ def run_test(description, spec, inp, expected, extra_flags="", if tests_to_run is not None and str(case_counter) not in tests_to_run: return - specfile = "recfm_spec" + specfile = "./recfm_spec" inpfile = "recfm_inp" outfile = "recfm_out" errfile = "recfm_err" From f82d6c002e1deb0b93980e756db8e749e02fe0c1 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Sun, 31 May 2026 13:48:32 +0300 Subject: [PATCH 27/50] New comprehensive CONTEXT unit test (#400) * New comprehensive CONTEXT unit test * Also increased maximum output of ProcessingTest to 1000 lines * Further beautify output of ProcessingTest --- specs/src/processing/Writer.h | 2 +- specs/src/test/ProcessingTest.cc | 141 +++++++++++++++++++++++++++-- specs/src/utils/StringQueue.h | 23 ++++- specs/src/utils/TimeUtils.cc | 7 +- specs/src/utils/TimeUtils.h | 2 + specs/tests/valgrind_unit_tests.py | 2 +- 6 files changed, 159 insertions(+), 18 deletions(-) diff --git a/specs/src/processing/Writer.h b/specs/src/processing/Writer.h index a517428..f5529a4 100644 --- a/specs/src/processing/Writer.h +++ b/specs/src/processing/Writer.h @@ -58,7 +58,7 @@ typedef std::shared_ptr PSimpleWriter; // Only used by ProcessingTest class StringWriter : public Writer { public: - StringWriter() {} + StringWriter() { 
m_queue.setCapacity(1000); } ~StringWriter() override {} void WriteOut() override {} void WriteOutDo(PSpecString ps, classifyingTimer& tmr) override diff --git a/specs/src/test/ProcessingTest.cc b/specs/src/test/ProcessingTest.cc index 8a2098c..0d0bef0 100644 --- a/specs/src/test/ProcessingTest.cc +++ b/specs/src/test/ProcessingTest.cc @@ -15,10 +15,40 @@ extern char g_printonly_rule; extern bool g_keep_suppressed_record; extern unsigned int g_WhileGuardLimit; -std::string prettify(std::string src) +// DUMP_TO_FILES levels: +// 0 - Never dump to files +// 1 - Dump to files only for failed tests (anything but "Res") +// 2 - Always dump to files - later tests will overwrite previous tests + +#define DUMP_TO_FILES 1 + +std::string prettify(std::string label, std::string src) { - std::string ret; +#ifdef DUMP_TO_FILES +#if DUMP_TO_FILES > 0 +#if DUMP_TO_FILES == 1 + if ("Res" != label) { +#endif + std::string fname = std::string("ProcessingTest.")+label; + auto f = fopen(fname.c_str(), "w"); + fprintf(f, "%s", src.c_str()); + fclose (f); +#if DUMP_TO_FILES == 1 + } +#endif +#endif +#endif + std::string ret(label); + int max_chars = (label == "Res") ? 
160 : 256; + ret += ": <"; for (char c : src) { + if (max_chars <= 0) { + if (max_chars == 0) { + ret += "..."; + max_chars--; + } + continue; + } switch (c) { case '\n': ret.append("\\n"); @@ -31,7 +61,9 @@ std::string prettify(std::string src) default: ret+=c; } + max_chars--; } + ret += ">"; return ret; } @@ -41,16 +73,16 @@ std::string prettify(std::string src) PSpecString ps = runTestOnExample(sp, "The quick brown fox jumped over the lazy dog"); \ std::cout << "Test #" << std::setfill('0') << std::setw(3) << testCount << " "; \ if (!ps) { \ - std::cout << "*** NOT OK ***: Got (NULL); Expected: <" << ex << ">\n"; \ + std::cout << "*** NOT OK ***\n\tGot: (NULL)\n\tExpected: <" << ex << ">\n"; \ errorCount++; \ failedTests.push_back(testCount); \ } else { \ if (*(ps) != std::string(ex)) { \ - std::cout << "*** NOT OK ***:\n\tGot <" << prettify(*ps) << ">\n\tExp <" << prettify(ex) << ">\n"; \ + std::cout << "*** NOT OK ***\n\t" << prettify("Got", *ps) << "\n\t" << prettify("Expected", ex) << "\n"; \ errorCount++; \ failedTests.push_back(testCount); \ } else { \ - std::cout << "***** OK *****: <" << prettify(ex) << ">\n"; \ + std::cout << "***** OK ***** " << prettify("Res", ex) << "\n"; \ } \ } \ } while (0); @@ -61,16 +93,16 @@ std::string prettify(std::string src) PSpecString ps = runTestOnExample(sp, ln); \ std::cout << "Test #" << std::setfill('0') << std::setw(3) << testCount << " "; \ if (!ps) { \ - std::cout << "*** NOT OK ***: Got (NULL); Expected: <" << prettify(ex) << ">\n"; \ + std::cout << "*** NOT OK ***\n\tGot: (NULL)\n\t" << prettify("Expected", ex) << "\n"; \ errorCount++; \ failedTests.push_back(testCount); \ } else { \ if (*(ps) != std::string(ex)) { \ - std::cout << "*** NOT OK ***:\n\tGot <" << prettify(*ps) << ">\n\tExp <" << prettify(ex) << ">\n"; \ + std::cout << "*** NOT OK ***\n\t" << prettify("Got", *ps) << "\n\t" << prettify("Expected", ex) << "\n"; \ errorCount++; \ failedTests.push_back(testCount); \ } else { \ - std::cout << "***** OK 
*****: <" << prettify(ex) << ">\n"; \ + std::cout << "***** OK ***** " << prettify("Res", ex) << "\n"; \ } \ } \ } while (0); @@ -85,11 +117,11 @@ std::string prettify(std::string src) actual_res = e.what(true); \ } \ if (res==actual_res) { \ - std::cout << "***** OK *****: <" << prettify(actual_res) << ">\n"; \ + std::cout << "***** OK ***** " << prettify("Res", actual_res) << "\n"; \ } else { \ errorCount++; \ failedTests.push_back(testCount); \ - std::cout << "*** NOT OK ***:\n\tGot <" << prettify(actual_res) << ">\n\tExp <" << prettify(res) << ">\n"; \ + std::cout << "*** NOT OK ***\n\t" << prettify("Got", actual_res) << "\n\t" << prettify("Expected", res) << "\n"; \ } \ } while (0); @@ -1002,6 +1034,95 @@ int main(int argc, char** argv) spec = "PRINT 'range(1,3)' 1 PRINT 'eof()' NEXTWORD"; VERIFY2(spec, "abcdef", "abc 0\n1"); // TEST #259 + + // All the ways of accessing a record with and without CONTEXT + spec = "'Cycle:' 1 PRINT 'recno()' WRITE " \ + " 'Context:' 3 PRINT 'ctxrecno()' WRITE" \ + " 'Using Spec Units:' 5 1-* 25 WRITE" \ + " 'Using record():' 5 PRINT 'record()' 25 WRITE" \ + " 'Using @@:' 5 PRINT '@@' 25 WRITE" \ + " 'Using @!:' 5 PRINT '@!' 25 WRITE" \ + " 'Using cfrecord():' 5 PRINT 'cfrecord()' 25 WRITE" \ + " 'Setting CONTEXT to +1' 3 CONTEXT +1 WRITE" \ + " 'Context:' 3 PRINT 'ctxrecno()' WRITE" \ + " 'Using Spec Units:' 5 1-* 25 WRITE" \ + " 'Using record():' 5 PRINT 'record()' 25 WRITE" \ + " 'Using @@:' 5 PRINT '@@' 25 WRITE" \ + " 'Using @!:' 5 PRINT '@!' 25 WRITE" \ + " 'Using cfrecord():' 5 PRINT 'cfrecord()' 25 WRITE" \ + " 'Setting CONTEXT to -1' 3 CONTEXT -1 WRITE" \ + " 'Context:' 3 PRINT 'ctxrecno()' WRITE" \ + " 'Using Spec Units:' 5 1-* 25 WRITE" \ + " 'Using record():' 5 PRINT 'record()' 25 WRITE" \ + " 'Using @@:' 5 PRINT '@@' 25 WRITE" \ + " 'Using @!:' 5 PRINT '@!'
25 WRITE" \ + " 'Using cfrecord():' 5 PRINT 'cfrecord()' 25 WRITE"; + strm = "Wise men say\nOnly fools rush in\nBut I can't help falling in love with you"; + res = \ + "Cycle: 1\n" \ + " Context: 1\n" \ + " Using Spec Units: Wise men say\n" \ + " Using record(): Wise men say\n" \ + " Using @@: Wise men say\n" \ + " Using @!: Wise men say\n" \ + " Using cfrecord(): Wise men say\n" \ + " Setting CONTEXT to +1\n" \ + " Context: 2\n" \ + " Using Spec Units: Only fools rush in\n" \ + " Using record(): Only fools rush in\n" \ + " Using @@: Wise men say\n" \ + " Using @!: Only fools rush in\n" \ + " Using cfrecord(): Wise men say\n" \ + " Setting CONTEXT to -1\n" \ + " Context: 0\n" \ + " Using Spec Units: \n" \ + " Using record(): \n" \ + " Using @@: Wise men say\n" \ + " Using @!: \n" \ + " Using cfrecord(): Wise men say\n" \ + "Cycle: 2\n" \ + " Context: 2\n" \ + " Using Spec Units: Only fools rush in\n" \ + " Using record(): Only fools rush in\n" \ + " Using @@: Only fools rush in\n" \ + " Using @!: Only fools rush in\n" \ + " Using cfrecord(): Only fools rush in\n" \ + " Setting CONTEXT to +1\n" \ + " Context: 3\n" \ + " Using Spec Units: But I can't help falling in love with you\n" \ + " Using record(): But I can't help falling in love with you\n" \ + " Using @@: Only fools rush in\n" \ + " Using @!: But I can't help falling in love with you\n" \ + " Using cfrecord(): Only fools rush in\n" \ + " Setting CONTEXT to -1\n" \ + " Context: 1\n" \ + " Using Spec Units: Wise men say\n" \ + " Using record(): Wise men say\n" \ + " Using @@: Only fools rush in\n" \ + " Using @!: Wise men say\n" \ + " Using cfrecord(): Only fools rush in\n" \ + "Cycle: 3\n" \ + " Context: 3\n" \ + " Using Spec Units: But I can't help falling in love with you\n" \ + " Using record(): But I can't help falling in love with you\n" \ + " Using @@: But I can't help falling in love with you\n" \ + " Using @!: But I can't help falling in love with you\n" \ + " Using cfrecord(): But I can't help 
falling in love with you\n" \ + " Setting CONTEXT to +1\n" \ + " Context: 4\n" \ + " Using Spec Units: \n" \ + " Using record(): \n" \ + " Using @@: But I can't help falling in love with you\n" \ + " Using @!: \n" \ + " Using cfrecord(): But I can't help falling in love with you\n" \ + " Setting CONTEXT to -1\n" \ + " Context: 2\n" \ + " Using Spec Units: Only fools rush in\n" \ + " Using record(): Only fools rush in\n" \ + " Using @@: But I can't help falling in love with you\n" \ + " Using @!: Only fools rush in\n" \ + " Using cfrecord(): But I can't help falling in love with you"; + VERIFY2(spec, strm.c_str(), res.c_str()); // TEST #260 if (errorCount) { std::cout << '\n' << errorCount << '/' << testCount << " tests failed.\n"; diff --git a/specs/src/utils/StringQueue.h b/specs/src/utils/StringQueue.h index c268d7a..c3aa615 100644 --- a/specs/src/utils/StringQueue.h +++ b/specs/src/utils/StringQueue.h @@ -21,13 +21,24 @@ template class MTQueue std::condition_variable cv_QueueFull; queueTimer m_timer; bool m_Done; + size_t m_highWaterMark; + size_t m_lowWaterMark; + static size_t computeLowWaterMark(size_t hwm) { + return (hwm > 10) ? (hwm - hwm / 10) : (hwm > 2 ? 
hwm - 2 : 0); + } public: - MTQueue() : m_Done(false) {} + MTQueue(size_t highWaterMark = QUEUE_HIGH_WM) + : m_Done(false) + , m_highWaterMark(highWaterMark) + , m_lowWaterMark(computeLowWaterMark(highWaterMark)) + { + m_timer.setCapacity(m_highWaterMark); + } void push(T const& data) { MYASSERT(data!=nullptr); uniqueLock lock(m_Mutex); - while (m_Queue.size()>=QUEUE_HIGH_WM) { + while (m_Queue.size()>=m_highWaterMark) { cv_QueueFull.wait(lock); } m_Queue.push(data); @@ -57,7 +68,7 @@ template class MTQueue size_t queueSize = m_Queue.size(); m_timer.decrement(); lock.unlock(); - if (queueSize < QUEUE_LOW_WM) { + if (queueSize < m_lowWaterMark) { cv_QueueFull.notify_one(); } return true; @@ -69,6 +80,12 @@ template class MTQueue cv_QueueEmpty.notify_one(); } + void setCapacity(size_t highWaterMark) { + m_highWaterMark = highWaterMark; + m_lowWaterMark = computeLowWaterMark(highWaterMark); + m_timer.setCapacity(highWaterMark); + } + void DumpStats(std::string title) { m_timer.dump(title); } diff --git a/specs/src/utils/TimeUtils.cc b/specs/src/utils/TimeUtils.cc index 6b1ef94..b7fac17 100644 --- a/specs/src/utils/TimeUtils.cc +++ b/specs/src/utils/TimeUtils.cc @@ -216,6 +216,7 @@ void classifyingTimer::dump(std::string title) queueTimer::queueTimer() { m_lastIncDec = m_lastTimePoint = HClock::now(); + m_capacity = QUEUE_HIGH_WM; m_elements = 0; m_ns_elems = 0; m_currentClass = queueTimeClassEmpty; @@ -259,7 +260,7 @@ void queueTimer::dump(std::string title) // average double averageFill = double(m_ns_elems) / double(totalDuration); - oss << "\tAverage: " << averageFill << " (capacity = " << QUEUE_HIGH_WM << ")\n"; + oss << "\tAverage: " << averageFill << " (capacity = " << m_capacity << ")\n"; std::cerr << oss.str(); } @@ -269,7 +270,7 @@ void queueTimer::increment() auto now = HClock::now(); if (m_elements == 1 && m_currentClass == queueTimeClassEmpty) { changeClass(queueTimeClassOther, now); - } else if (m_elements == (QUEUE_HIGH_WM-1) && m_currentClass == 
queueTimeClassOther) { + } else if (m_elements == (m_capacity-1) && m_currentClass == queueTimeClassOther) { changeClass(queueTimeClassFull, now); } @@ -286,7 +287,7 @@ void queueTimer::decrement() MYASSERT(m_elements > 0); if (m_elements == 1 && m_currentClass == queueTimeClassOther) { changeClass(queueTimeClassEmpty, now); - } else if (m_elements == (QUEUE_HIGH_WM-1) && m_currentClass == queueTimeClassFull) { + } else if (m_elements == (m_capacity-1) && m_currentClass == queueTimeClassFull) { changeClass(queueTimeClassOther, now); } diff --git a/specs/src/utils/TimeUtils.h b/specs/src/utils/TimeUtils.h index e1b796c..b64dd71 100644 --- a/specs/src/utils/TimeUtils.h +++ b/specs/src/utils/TimeUtils.h @@ -64,11 +64,13 @@ enum queueTimeClasses { class queueTimer { public: queueTimer(); + void setCapacity(size_t capacity) { m_capacity = capacity; } void increment(); void decrement(); void drain(); void dump(std::string title); private: + size_t m_capacity; size_t m_elements; std::chrono::time_point m_lastTimePoint; std::chrono::time_point m_lastIncDec; diff --git a/specs/tests/valgrind_unit_tests.py b/specs/tests/valgrind_unit_tests.py index 8ad137d..acd636c 100644 --- a/specs/tests/valgrind_unit_tests.py +++ b/specs/tests/valgrind_unit_tests.py @@ -1,7 +1,7 @@ import sys, memcheck, argparse count_ALU_tests = 832 -count_processing_tests = 259 +count_processing_tests = 260 count_token_tests = 17 # Parse the one command line options From 10aa1ad865d7d7751103e8bf9282b53c15e1ed6e Mon Sep 17 00:00:00 2001 From: niry1 Date: Mon, 1 Jun 2026 15:48:28 +0300 Subject: [PATCH 28/50] Issue #403 - raise the run-dry flag when READ... 
or READSTOP try to read beyond the input --- specs/src/processing/Reader.cc | 6 +++++- specs/tests/valgrind_specs.py | 8 ++++---- 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/specs/src/processing/Reader.cc b/specs/src/processing/Reader.cc index 4688165..9164ed5 100644 --- a/specs/src/processing/Reader.cc +++ b/specs/src/processing/Reader.cc @@ -62,7 +62,11 @@ PSpecString Reader::get(classifyingTimer& tmr, unsigned int& _readerCounter) tmr.changeClass(timeClassIO); ret = getNextRecord(); tmr.changeClass(timeClassProcessing); - if (!ret) _readerCounter--; + if (!ret) { + MYASSERT(_readerCounter>0); + _readerCounter--; + m_bRanDry = true; + } else { m_countRead++; m_countUsed++; diff --git a/specs/tests/valgrind_specs.py b/specs/tests/valgrind_specs.py index ff8645c..74fd47b 100644 --- a/specs/tests/valgrind_specs.py +++ b/specs/tests/valgrind_specs.py @@ -337,11 +337,11 @@ def run_case(spec, input, description, expected_rc=memcheck.RetCode_SUCCESS, con s = "print 'number()' 1 print 'recno()' nw READ print 'number()' nw print 'recno()' nw" i = "1\n2\n3\n4\n5\n6\n7\n8\n9" -run_case(s,i,"Functions: number & recno (2)",memcheck.RetCode_COMMAND_FAILED) +run_case(s,i,"Functions: number & recno (2)") s = "print 'number()' 1 print 'recno()' nw READSTOP print 'number()' nw print 'recno()' nw" i = "1\n2\n3\n4\n5\n6\n7\n8\n9" -run_case(s,i,"Functions: number & recno (3)",memcheck.RetCode_COMMAND_FAILED) +run_case(s,i,"Functions: number & recno (3)") s = "print 'record()' 1" i = "1\n\nhello" @@ -599,7 +599,7 @@ def run_case(spec, input, description, expected_rc=memcheck.RetCode_SUCCESS, con ENDIF """ i = input_samples.gitlog -run_case(s,i,"READ and READSTOP",memcheck.RetCode_COMMAND_FAILED) +run_case(s,i,"READ and READSTOP") # UNREAD s = \ @@ -618,7 +618,7 @@ def run_case(spec, input, description, expected_rc=memcheck.RetCode_SUCCESS, con UNREAD """ i = input_samples.gitlog -run_case(s,i,"UNREAD",memcheck.RetCode_COMMAND_FAILED) +run_case(s,i,"UNREAD") # REDO 
From f786cf9980496bcef207a664a070c6d4c47c4b6c Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Mon, 1 Jun 2026 16:54:17 +0300 Subject: [PATCH 29/50] Fix sword doc (#402) Co-authored-by: niry1 --- specs/src/processing/ProcessingState.h | 2 ++ specs/src/utils/aluFunctions.cc | 4 ++-- specs/src/utils/aluFunctions.h | 2 +- 3 files changed, 5 insertions(+), 3 deletions(-) diff --git a/specs/src/processing/ProcessingState.h b/specs/src/processing/ProcessingState.h index 1d84061..931690a 100644 --- a/specs/src/processing/ProcessingState.h +++ b/specs/src/processing/ProcessingState.h @@ -12,6 +12,8 @@ #define LOCAL_WHITESPACE "" #define DEFAULT_WORDSEPARATOR " " #define DEFAULT_FIELDSEPARATOR "\t" +#define DEFAULT_WORDSEPARATOR_C ' ' +#define DEFAULT_FIELDSEPARATOR_C '\t' #define STATION_FIRST -1 #define STATION_SECOND -2 diff --git a/specs/src/utils/aluFunctions.cc b/specs/src/utils/aluFunctions.cc index e900d23..e689be3 100644 --- a/specs/src/utils/aluFunctions.cc +++ b/specs/src/utils/aluFunctions.cc @@ -1698,7 +1698,7 @@ PValue AluFunc_sfield(PValue pStr, PValue pCount, PValue pSep) if (pSep && pSep->getStrPtr()->length() > 0) { sep = pSep->getStr()[0]; } else { - sep = '\t'; + sep = DEFAULT_FIELDSEPARATOR_C; } if (0 == count) { @@ -1792,7 +1792,7 @@ PValue AluFunc_sword(PValue pStr, PValue pCount, PValue pSep) if (pSep && pSep->getStrPtr()->length() > 0) { sep = pSep->getStr()[0]; } else { - sep = ' '; + sep = DEFAULT_WORDSEPARATOR_C; } if (0 == count) { diff --git a/specs/src/utils/aluFunctions.h b/specs/src/utils/aluFunctions.h index f427f28..d21bdb9 100644 --- a/specs/src/utils/aluFunctions.h +++ b/specs/src/utils/aluFunctions.h @@ -144,7 +144,7 @@ X(rvalue, 2, ALUFUNC_REGULAR, false, \ "(str,[sep]) - Return the right hand part of 'str' separated by 'sep'.","'sep' defaults to an equals sign.") \ X(sword, 3, ALUFUNC_REGULAR, false, \ - "(str,n,[sep]) - Returns the n-th word of 'str' if the word separator is 'sep'.","'sep' defaults to a tab.") \ + "(str,n,[sep]) - 
Returns the n-th word of 'str' if the word separator is 'sep'.","'sep' defaults to a space.") \ X(abbrev, 3, ALUFUNC_REGULAR, false, \ "(str,s,[len]) - Returns TRUE (1) if 's' is a prefix of 'str', or FALSE (0) otherwise.","If 'len' is specified, only the first 'len' characters of 's' are considered.") \ X(compare, 3, ALUFUNC_REGULAR, false, \ From bc0831f284414127fcc41b1d4f4a8b23ac34d6a7 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Mon, 1 Jun 2026 16:58:24 +0300 Subject: [PATCH 30/50] Issue #401 - Document the interactions between READSTOP and context (#405) - Also gdb improvements --- manpage | 26 ++- specs/docs/alu.md | 2 +- specs/docs/streams.md | 8 +- specs/src/gdb/specs.gdb | 23 ++- specs/src/gdb/specs_gdb.py | 377 +++++++++++++++++++++++-------------- 5 files changed, 274 insertions(+), 162 deletions(-) diff --git a/manpage b/manpage index 491b163..66fb5ef 100644 --- a/manpage +++ b/manpage @@ -228,9 +228,17 @@ Convert printable time format to a number, representing microseconds since the e .SS "Other Spec Units" There are also other spec units, that may be used: .IP "READ" 3 -Causes the program to read the next line of input. If we have already read the last line, the read line is taken to be the empty string. +Causes the program to read the next line of input. If we have already read the last line, the read line is taken to be the empty string. When a +.B READ +or +.B READSTOP +spec unit is applied, the context offset is reset to zero (the current record). .IP "READSTOP" 3 -Causes the program to read the next line of input. If we have already read the last line, no more processing is done for this iteration. +Causes the program to read the next line of input. If we have already read the last line, no more processing is done for this iteration. When a +.B READ +or +.B READSTOP +spec unit is applied, the context offset is reset to zero (the current record).
.IP "UNREAD" 3 Causes the program to push back the current active record back to the reader, so that the next iteration of the specification or the next .I READ @@ -367,7 +375,11 @@ resets the working string to the current record, sets it to the next record, .B CONTEXT -1 sets it to the previous record, and so on. -The required buffer sizes are computed at parse time from the largest positive and negative offsets used. If the offset refers to a record beyond the beginning or end of the input, the working string is set to the empty string. +The required buffer sizes are computed at parse time from the largest positive and negative offsets used. If the offset refers to a record beyond the beginning or end of the input, the working string is set to the empty string. Note that reading beyond the input with +.B CONTEXT +does not cause processing to stop, even if a +.B READSTOP +token is present in the specification. .P .B CONTEXT must not be abbreviated. It is not supported with threading or multiple input streams. @@ -399,6 +411,14 @@ Example: echo -e "A\\nB\\nC" | specs print @+1 1 .P produces: "B", "C", "" -- each line shows the next record (empty for the last). +.P +Note that reading beyond the input with +.B @+n +or +.B @-n +does not cause processing to stop, even if a +.B READSTOP +token is present in the specification. .SS "MainOptions" These are optional spec units that appear at the beginning of the specification and modify the behavior of the entire specification. diff --git a/specs/docs/alu.md b/specs/docs/alu.md index f88d765..222a985 100644 --- a/specs/docs/alu.md +++ b/specs/docs/alu.md @@ -117,7 +117,7 @@ POSIX (darwin) system using the g++ compiler and Python 3.9.6 - release variatio ``` Others are `@cols`, which contains the number of columns in the terminal screen, and `@rows`, which contains the number of rows on that same screen. -Additionally, the `@@` string stands for the entire input record. 
When rolling context is in effect (see [Streams and Records](streams.md#rolling-context)), `@@` always refers to the original input record. The `@!` string refers to the current record as affected by `CONTEXT`, which is the same as `@@` when no `CONTEXT` is active. The `@-n` and `@+n` syntax is an alternative to using that is effective within expressions. The following three specifications are equivalent: +Additionally, the `@@` string stands for the entire input record. When rolling context is in effect (see [Streams and Records](streams.md#rolling-context)), `@@` always refers to the original input record. The `@!` string refers to the current record as affected by `CONTEXT`, which is the same as `@@` when no `CONTEXT` is active. The `@-n` and `@+n` syntax is an alternative to using that is effective within expressions. Note that reading beyond the input with `@+n` or `@-n` does not cause processing to stop, even if a `READSTOP` token is present in the specification. The following three specifications are equivalent: ``` # Using @@ syntax # Using the CONTEXT keyword # No expression - just data fields diff --git a/specs/docs/streams.md b/specs/docs/streams.md index f4ae33e..856c7ad 100644 --- a/specs/docs/streams.md +++ b/specs/docs/streams.md @@ -36,7 +36,7 @@ So why do we have data fields at all if we don't want to output them? There can ## >1 Input Records in Each Iteration Sometimes we would like to use more than one input record to produce our output record. We use the `READ` or `READSTOP` keywords for that. -Both `READ` and `READSTOP` read the next record from the input stream to be the new active input record. The difference is what to do if the current line was the last. With `READ` the specification continues to be executed as if we have just read an empty record. With `READSTOP` the execution of the specification stops. +Both `READ` and `READSTOP` read the next record from the input stream to be the new active input record. 
The difference is what to do if the current line was the last. With `READ` the specification continues to be executed as if we have just read an empty record. With `READSTOP` the execution of the specification stops. When a `READ` or `READSTOP` spec unit is applied, the context offset is reset to zero (the current record). Below is an example of a specification that handles git log. A git log looks something like this: ``` @@ -250,7 +250,7 @@ A few things to note: ## Rolling Context The `SELECT SECOND` mechanism described above lets us peek one record ahead. But what if we need to look further ahead, or look *behind* at records we've already seen? The `CONTEXT` spec unit provides a general way to do this. -`CONTEXT` takes a single integer argument -- a positive number to look forward, a negative number to look backward, or zero to reset to the current record. When **specs** encounters a `CONTEXT` spec unit, it changes the active input record to the one at the given offset from the current record. Any input parts that follow will read from that record instead of the current one. +`CONTEXT` takes a single integer argument -- a positive number to look forward, a negative number to look backward, or zero to reset to the current record. When **specs** encounters a `CONTEXT` spec unit, it changes the active input record to the one at the given offset from the current record. Any input parts that follow will read from that record instead of the current one. Note that reading beyond the input with `CONTEXT` does not cause processing to stop, even if a `READSTOP` token is present in the specification. Consider the following input: ``` @@ -294,12 +294,14 @@ gamma ``` The first column comes from `WORD 1` while the *next* record is selected, and the second column comes from `WORD 1` after `CONTEXT 0` resets back to the current record. +Note that when a `READ` or `READSTOP` spec unit is applied, the context offset is automatically reset to zero (the current record). 
This means that any context offset set by a `CONTEXT` spec unit will be lost when `READ` or `READSTOP` is executed. + ### Context in Expressions In addition to the `CONTEXT` spec unit, **specs** supports the `@+n` and `@-n` syntax in expressions, where *n* is a non-negative integer. These evaluate to the full content of the record at the given offset: ``` specs PRINT "length(@+1)" 1 ``` -Given the input `AB`, `CDE`, `F`, this outputs `3`, `1`, `0` -- the length of the *next* record in each cycle. +Given the input `AB`, `CDE`, `F`, this outputs `3`, `1`, `0` -- the length of the *next* record in each cycle. Note that reading beyond the input with `@+n` or `@-n` does not cause processing to stop, even if a `READSTOP` token is present in the specification. Note that `@@` (the current input record) and `@+0` or `@-0` are not quite the same thing when `CONTEXT` is also used: `@@` always returns the real input record, regardless of any `CONTEXT` that may be in effect. To get the context-affected record in an expression, use `@!`: ``` diff --git a/specs/src/gdb/specs.gdb b/specs/src/gdb/specs.gdb index b124feb..07468a2 100644 --- a/specs/src/gdb/specs.gdb +++ b/specs/src/gdb/specs.gdb @@ -65,39 +65,39 @@ define dump_item end define dump_data_field - dump-data-field $arg0 + dump-data-field ((DataField)($arg0)) end define dump_token_item - dump-token-item $arg0 + dump-token-item ((TokenItem)($arg0)) end define dump_set_item - dump-set-item $arg0 + dump-set-item ((SetItem)($arg0)) end define dump_skip_item - dump-skip-item $arg0 + dump-skip-item ((SkipItem)($arg0)) end define dump_condition_item - dump-condition-item $arg0 + dump-condition-item ((ConditionItem)($arg0)) end define dump_break_item - dump-break-item $arg0 + dump-break-item ((BreakItem)($arg0)) end define dump_context_item - dump-context-item $arg0 + dump-context-item ((ContextItem)($arg0)) end define dump_select_item - dump-select-item $arg0 + dump-select-item ((SelectItem)($arg0)) end define dump_split_item - 
dump-split-item $arg0 + dump-split-item ((SplitItem)($arg0)) end # itemGroup @@ -156,6 +156,10 @@ define dump_freq_map dump-frequency-map $arg0 end +define dump_compute_stack + dump-compute-stack $arg0 +end + # Python interface define dump_alu_function dump-alu-function $arg0 @@ -279,6 +283,7 @@ echo dump_token - Dump Token\n echo dump_alu_value - Dump ALUValue\n echo dump_alu_counters - Dump ALUCounters\n echo dump_alu_vec - Dump AluVec\n +echo dump_compute_stack - Dump compute stack (std::stack)\n echo dump_alu_function - Dump AluFunction\n echo dump_external_func_rec - Dump ExternalFunctionRec\n echo dump_python_func_collection - Dump PythonFunctionCollection\n diff --git a/specs/src/gdb/specs_gdb.py b/specs/src/gdb/specs_gdb.py index 3a1071e..1a023c5 100644 --- a/specs/src/gdb/specs_gdb.py +++ b/specs/src/gdb/specs_gdb.py @@ -763,6 +763,145 @@ def invoke(self, arg, from_tty): # DUMP COMMANDS - Item Hierarchy # ============================================================================ +# Helper functions for dumping derived Item classes +def _dump_data_field_details(val): + """Dump DataField-specific fields.""" + print(f"DataField:") + label = chr(int(val["m_label"])) if int(val["m_label"]) > 0 else "none" + tail_label = chr(int(val["m_tailLabel"])) if int(val["m_tailLabel"]) > 0 else "none" + out_start = int(val["m_outStart"]) + max_len = int(val["m_maxLength"]) + strip = bool(val["m_strip"]) + conv = int(val["m_conversion"]) + align = int(val["m_alignment"]) + + conv_str = STRING_CONVERSIONS.get(conv, f"Unknown({conv})") + align_str = OUTPUT_ALIGNMENT.get(align, f"Unknown({align})") + + print(f" Label: {label}/{tail_label}") + print(f" Output Start: {out_start}") + print(f" Max Length: {max_len}") + print(f" Strip: {strip}") + print(f" Conversion: {conv_str}") + print(f" Alignment: {align_str}") + + # m_InputPart (shared_ptr) + try: + ip_ptr = val["m_InputPart"]["_M_ptr"] + if int(ip_ptr) == 0: + print(f" Input Part: ") + else: + result = gdb.parse_and_eval( 
+ f'((InputPart*)({int(ip_ptr)}))->Debug()') + debug_str = std_string_to_str(result) + print(f" Input Part: {debug_str}") + except Exception as ex: + print(f" Input Part: ") + + # AluVec fields + _dump_alu_vec_field(val, "m_outputStartExpression", "Output Start Expression") + _dump_alu_vec_field(val, "m_outputWidthExpression", "Output Width Expression") + _dump_alu_vec_field(val, "m_outputAlignmentExpression", "Output Alignment Expression") + +def _dump_alu_vec_field(val, field_name, label): + """Print an AluVec field on one line, or 'empty' if it has no elements.""" + try: + vec = val[field_name] + size = std_vector_size(vec) + if size == 0: + print(f" {label}: empty") + else: + start = vec["_M_impl"]["_M_start"] + items = [] + for i in range(size): + try: + ptr = start[i]["_M_ptr"] + if int(ptr) != 0: + result = gdb.parse_and_eval( + f'((AluUnit*)({int(ptr)}))->_identify()') + items.append(std_string_to_str(result)) + else: + items.append("") + except: + items.append("?") + print(f" {label}: {'; '.join(items)}") + except Exception as ex: + print(f" {label}: ") + +def _dump_token_item_details(val): + """Dump TokenItem-specific fields.""" + print(f"TokenItem:") + token = deref_shared_ptr(val["mp_Token"]) + if token: + type_val = int(token["m_type"]) + type_str = TOKEN_TYPES.get(type_val, f"Unknown({type_val})") + print(f" Token type: {type_str}") + else: + print(f" mp_Token: ") + +def _dump_set_item_details(val): + """Dump SetItem-specific fields.""" + print(f"SetItem:") + raw_expr = std_string_to_str(val["m_rawExpression"]) + key = int(val["m_key"]) + print(f" Expression: \"{raw_expr}\"") + print(f" Key: {key}") + +def _dump_skip_item_details(val): + """Dump SkipItem-specific fields.""" + print(f"SkipItem:") + raw_expr = std_string_to_str(val["m_rawExpression"]) + is_until = bool(val["m_bIsUntil"]) + satisfied = bool(val["m_bSatisfied"]) + skip_type = "SKIPUNTIL" if is_until else "SKIPWHILE" + print(f" Type: {skip_type}") + print(f" Expression: \"{raw_expr}\"") + 
print(f" Satisfied: {satisfied}") + +def _dump_condition_item_details(val): + """Dump ConditionItem-specific fields.""" + print(f"ConditionItem:") + pred = int(val["m_pred"]) + pred_str = CONDITION_PREDICATE.get(pred, f"Unknown({pred})") + raw_expr = std_string_to_str(val["m_rawExpression"]) + is_assn = bool(val["m_isAssignment"]) + print(f" Predicate: {pred_str}") + print(f" Expression: \"{raw_expr}\"") + print(f" Is Assignment: {is_assn}") + +def _dump_break_item_details(val): + """Dump BreakItem-specific fields.""" + print(f"BreakItem:") + ident = chr(int(val["m_identifier"])) + print(f" Identifier: {ident}") + +def _dump_context_item_details(val): + """Dump ContextItem-specific fields.""" + print(f"ContextItem:") + offset = int(val["m_offset"]) + print(f" Offset: {offset}") + +def _dump_select_item_details(val): + """Dump SelectItem-specific fields.""" + print(f"SelectItem:") + stream = int(val["m_stream"]) + b_output = bool(val["bOutput"]) + print(f" Stream: {stream}") + print(f" Output: {b_output}") + +def _dump_split_item_details(val): + """Dump SplitItem-specific fields.""" + print(f"SplitItem:") + is_field = bool(val["m_isField"]) + sep = std_string_to_str(val["m_separator"]) + splitting = bool(val["m_splitting"]) + current_piece = int(val["m_currentPiece"]) + split_type = "SPLITF" if is_field else "SPLITW" + print(f" Type: {split_type}") + print(f" Separator: \"{sep}\"") + print(f" Splitting: {splitting}") + print(f" Current Piece: {current_piece}") + class DumpItem(gdb.Command): """Dump an Item (polymorphic).""" @@ -809,6 +948,47 @@ def invoke(self, arg, from_tty): pass print("----- end of 'Item' dump") + + # Determine the dynamic type and dump derived-class-specific info. + # `val` may be a pointer (e.g. Item*), or an object/reference + # (e.g. when called with `*this`). Resolve it to the concrete + # object cast to its dynamic type so derived fields are accessible. 
+ type_name = "Item" + derived = None + try: + obj = val + if obj.type.strip_typedefs().code == gdb.TYPE_CODE_PTR: + obj = obj.dereference() + dyn_type = obj.dynamic_type + type_name = dyn_type.name or type_name + # Cast to the dynamic type so derived-class fields can be read + derived = obj.cast(dyn_type) + except Exception: + derived = None + + # Dispatch to appropriate helper based on dynamic type + if derived is not None: + try: + if "DataField" in type_name: + _dump_data_field_details(derived) + elif "TokenItem" in type_name: + _dump_token_item_details(derived) + elif "SetItem" in type_name: + _dump_set_item_details(derived) + elif "SkipItem" in type_name: + _dump_skip_item_details(derived) + elif "ConditionItem" in type_name: + _dump_condition_item_details(derived) + elif "BreakItem" in type_name: + _dump_break_item_details(derived) + elif "ContextItem" in type_name: + _dump_context_item_details(derived) + elif "SelectItem" in type_name: + _dump_select_item_details(derived) + elif "SplitItem" in type_name: + _dump_split_item_details(derived) + except Exception as e: + print(f"Error dumping derived-class details: {e}") except Exception as e: print(f"Error: {e}") @@ -825,72 +1005,8 @@ def invoke(self, arg, from_tty): if self.dump_item is None: self.dump_item = DumpItem() self.dump_item.invoke(arg, from_tty) - - # Then print derived class fields - print(f"DataField:") - val = gdb.parse_and_eval(arg) - label = chr(int(val["m_label"])) if int(val["m_label"]) > 0 else "none" - tail_label = chr(int(val["m_tailLabel"])) if int(val["m_tailLabel"]) > 0 else "none" - out_start = int(val["m_outStart"]) - max_len = int(val["m_maxLength"]) - strip = bool(val["m_strip"]) - conv = int(val["m_conversion"]) - align = int(val["m_alignment"]) - - conv_str = STRING_CONVERSIONS.get(conv, f"Unknown({conv})") - align_str = OUTPUT_ALIGNMENT.get(align, f"Unknown({align})") - - print(f" Label: {label}/{tail_label}") - print(f" Output Start: {out_start}") - print(f" Max Length: 
{max_len}") - print(f" Strip: {strip}") - print(f" Conversion: {conv_str}") - print(f" Alignment: {align_str}") - - # m_InputPart (shared_ptr) - try: - ip_ptr = val["m_InputPart"]["_M_ptr"] - if int(ip_ptr) == 0: - print(f" Input Part: ") - else: - result = gdb.parse_and_eval( - f'((InputPart*)({int(ip_ptr)}))->Debug()') - debug_str = std_string_to_str(result) - print(f" Input Part: {debug_str}") - except Exception as ex: - print(f" Input Part: ") - - # AluVec fields - self._dump_alu_vec_field(val, "m_outputStartExpression", "Output Start Expression") - self._dump_alu_vec_field(val, "m_outputWidthExpression", "Output Width Expression") - self._dump_alu_vec_field(val, "m_outputAlignmentExpression", "Output Alignment Expression") except Exception as e: print(f"Error: {e}") - - def _dump_alu_vec_field(self, val, field_name, label): - """Print an AluVec field on one line, or 'empty' if it has no elements.""" - try: - vec = val[field_name] - size = std_vector_size(vec) - if size == 0: - print(f" {label}: empty") - else: - start = vec["_M_impl"]["_M_start"] - items = [] - for i in range(size): - try: - ptr = start[i]["_M_ptr"] - if int(ptr) != 0: - result = gdb.parse_and_eval( - f'((AluUnit*)({int(ptr)}))->_identify()') - items.append(std_string_to_str(result)) - else: - items.append("") - except: - items.append("?") - print(f" {label}: {'; '.join(items)}") - except Exception as ex: - print(f" {label}: ") class DumpTokenItem(gdb.Command): """Dump a TokenItem.""" @@ -901,21 +1017,10 @@ def __init__(self): def invoke(self, arg, from_tty): try: - # First, call DumpItem to print base class fields + # Call DumpItem which now handles derived-class details if self.dump_item is None: self.dump_item = DumpItem() self.dump_item.invoke(arg, from_tty) - - # Then print derived class fields - print(f"TokenItem:") - val = gdb.parse_and_eval(arg) - token = deref_shared_ptr(val["mp_Token"]) - if token: - type_val = int(token["m_type"]) - type_str = TOKEN_TYPES.get(type_val, 
f"Unknown({type_val})") - print(f" Token type: {type_str}") - else: - print(f" mp_Token: ") except Exception as e: print(f"Error: {e}") @@ -928,18 +1033,10 @@ def __init__(self): def invoke(self, arg, from_tty): try: - # First, call DumpItem to print base class fields + # Call DumpItem which now handles derived-class details if self.dump_item is None: self.dump_item = DumpItem() self.dump_item.invoke(arg, from_tty) - - # Then print derived class fields - print(f"SetItem:") - val = gdb.parse_and_eval(arg) - raw_expr = std_string_to_str(val["m_rawExpression"]) - key = int(val["m_key"]) - print(f" Expression: \"{raw_expr}\"") - print(f" Key: {key}") except Exception as e: print(f"Error: {e}") @@ -952,21 +1049,10 @@ def __init__(self): def invoke(self, arg, from_tty): try: - # First, call DumpItem to print base class fields + # Call DumpItem which now handles derived-class details if self.dump_item is None: self.dump_item = DumpItem() self.dump_item.invoke(arg, from_tty) - - # Then print derived class fields - print(f"SkipItem:") - val = gdb.parse_and_eval(arg) - raw_expr = std_string_to_str(val["m_rawExpression"]) - is_until = bool(val["m_bIsUntil"]) - satisfied = bool(val["m_bSatisfied"]) - skip_type = "SKIPUNTIL" if is_until else "SKIPWHILE" - print(f" Type: {skip_type}") - print(f" Expression: \"{raw_expr}\"") - print(f" Satisfied: {satisfied}") except Exception as e: print(f"Error: {e}") @@ -979,21 +1065,10 @@ def __init__(self): def invoke(self, arg, from_tty): try: - # First, call DumpItem to print base class fields + # Call DumpItem which now handles derived-class details if self.dump_item is None: self.dump_item = DumpItem() self.dump_item.invoke(arg, from_tty) - - # Then print derived class fields - print(f"ConditionItem:") - val = gdb.parse_and_eval(arg) - pred = int(val["m_pred"]) - pred_str = CONDITION_PREDICATE.get(pred, f"Unknown({pred})") - raw_expr = std_string_to_str(val["m_rawExpression"]) - is_assn = bool(val["m_isAssignment"]) - print(f" Predicate: 
{pred_str}") - print(f" Expression: \"{raw_expr}\"") - print(f" Is Assignment: {is_assn}") except Exception as e: print(f"Error: {e}") @@ -1006,16 +1081,10 @@ def __init__(self): def invoke(self, arg, from_tty): try: - # First, call DumpItem to print base class fields + # Call DumpItem which now handles derived-class details if self.dump_item is None: self.dump_item = DumpItem() self.dump_item.invoke(arg, from_tty) - - # Then print derived class fields - print(f"BreakItem:") - val = gdb.parse_and_eval(arg) - ident = chr(int(val["m_identifier"])) - print(f" Identifier: {ident}") except Exception as e: print(f"Error: {e}") @@ -1028,16 +1097,10 @@ def __init__(self): def invoke(self, arg, from_tty): try: - # First, call DumpItem to print base class fields + # Call DumpItem which now handles derived-class details if self.dump_item is None: self.dump_item = DumpItem() self.dump_item.invoke(arg, from_tty) - - # Then print derived class fields - print(f"ContextItem:") - val = gdb.parse_and_eval(arg) - offset = int(val["m_offset"]) - print(f" Offset: {offset}") except Exception as e: print(f"Error: {e}") @@ -1050,18 +1113,10 @@ def __init__(self): def invoke(self, arg, from_tty): try: - # First, call DumpItem to print base class fields + # Call DumpItem which now handles derived-class details if self.dump_item is None: self.dump_item = DumpItem() self.dump_item.invoke(arg, from_tty) - - # Then print derived class fields - print(f"SelectItem:") - val = gdb.parse_and_eval(arg) - stream = int(val["m_stream"]) - b_output = bool(val["bOutput"]) - print(f" Stream: {stream}") - print(f" Output: {b_output}") except Exception as e: print(f"Error: {e}") @@ -1074,23 +1129,10 @@ def __init__(self): def invoke(self, arg, from_tty): try: - # First, call DumpItem to print base class fields + # Call DumpItem which now handles derived-class details if self.dump_item is None: self.dump_item = DumpItem() self.dump_item.invoke(arg, from_tty) - - # Then print derived class fields - 
print(f"SplitItem:") - val = gdb.parse_and_eval(arg) - is_field = bool(val["m_isField"]) - sep = std_string_to_str(val["m_separator"]) - splitting = bool(val["m_splitting"]) - current_piece = int(val["m_currentPiece"]) - split_type = "SPLITF" if is_field else "SPLITW" - print(f" Type: {split_type}") - print(f" Separator: \"{sep}\"") - print(f" Splitting: {splitting}") - print(f" Current Piece: {current_piece}") except Exception as e: print(f"Error: {e}") @@ -1720,6 +1762,48 @@ def invoke(self, arg, from_tty): except Exception as e: print(f"Error: {e}") +class DumpComputeStack(gdb.Command): + """Dump the compute stack (std::stack) with all its items.""" + + def __init__(self): + super(DumpComputeStack, self).__init__("dump-compute-stack", gdb.COMMAND_DATA) + + def invoke(self, arg, from_tty): + try: + val = gdb.parse_and_eval(arg) + + # Extract stack items using the std_stack_items utility + # Items are shared_ptr, so we need to dereference them + items = std_stack_items(val) + size = len(items) + + print(f"Compute Stack @ {val.address} with {size} items:") + + if size == 0: + print(" (empty)") + else: + for i, item in enumerate(items): + try: + # Dereference the shared_ptr + alu_value = deref_shared_ptr(item) + if alu_value is None: + print(f" [{i}] (nil)") + else: + type_val = int(alu_value["m_type"]) + type_str = ALU_COUNTER_TYPE.get(type_val, f"Unknown({type_val})") + val_str = std_string_to_str(alu_value["m_value"]) + exact = bool(alu_value["m_exact"]) + exact_str = " (exact)" if exact else " (inexact)" + + if type_val == 0: # counterType__None + print(f" [{i}] (nil)") + else: + print(f" [{i}] ({type_str}) {val_str}{exact_str}") + except Exception as item_e: + print(f" [{i}] (error: {item_e})") + except Exception as e: + print(f"Error: {e}") + # ============================================================================ # DUMP COMMANDS - Utilities # ============================================================================ @@ -2247,6 +2331,7 @@ def 
register_commands(): DumpAluVec() DumpAluValueStats() DumpFrequencyMap() + DumpComputeStack() # Python interface commands DumpAluFunction() From 6e10a4c709716b02967bde6323c8d768be8fa02a Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Tue, 2 Jun 2026 14:16:25 +0300 Subject: [PATCH 31/50] Issue #407 - Add the ctxoffset ALU function (#408) --- manpage | 4 ++++ specs/docs/alu_adv.md | 1 + specs/src/test/ProcessingTest.cc | 4 ++++ specs/src/utils/aluFunctions.cc | 5 +++++ specs/src/utils/aluFunctions.h | 2 ++ specs/tests/valgrind_unit_tests.py | 2 +- 6 files changed, 17 insertions(+), 1 deletion(-) diff --git a/manpage b/manpage index 66fb5ef..6ca73cf 100644 --- a/manpage +++ b/manpage @@ -973,6 +973,10 @@ does not. When no .I CONTEXT is active, returns the same value as .B recno(). +.IP "ctxoffset()" 3 +Returns the current effective context offset. When no +.I CONTEXT +is active, returns 0. .IP "record()" 3 Returns the entire input record. Equivalent to .B range(1,-1) diff --git a/specs/docs/alu_adv.md b/specs/docs/alu_adv.md index c7b03aa..764953f 100644 --- a/specs/docs/alu_adv.md +++ b/specs/docs/alu_adv.md @@ -165,6 +165,7 @@ All three regular expression functions have an argument called `matchFlags`. Thi | `range(n,m)` | Returns the substring from the *n*-th character (default first) to the *m*-th character (default last) | | `recno()` | Returns the number of the currently read record. If the `READ` or `READSTOP` keywords are used this may be greater than `number()` | | `ctxrecno()` | Returns the record number of the record that input parts work on. This is similar to `recno()`, but considers rolling context, which `recno()` does not. | +| `ctxoffset()` | Returns the current effective context offset. Returns 0 when no `CONTEXT` is in effect. | | `record()` | Returns the entire input record. Equivalent to `@!`. | | `cfrecord()` | Returns the entire input record, disregarding rolling context. Same as `record()` when `CONTEXT` is not in effect. Equivalent to `@@`. 
| | `word(n)` | Returns the *n*-th word | diff --git a/specs/src/test/ProcessingTest.cc b/specs/src/test/ProcessingTest.cc index 0d0bef0..9134d47 100644 --- a/specs/src/test/ProcessingTest.cc +++ b/specs/src/test/ProcessingTest.cc @@ -1124,6 +1124,10 @@ int main(int argc, char** argv) " Using cfrecord(): But I can't help falling in love with you"; VERIFY2(spec, strm.c_str(), res.c_str()); // TEST #260 + // ctxoffset() function test + spec = "PRINT 'ctxoffset()' 1 CONTEXT +1 PRINT 'ctxoffset()' NW CONTEXT -1 PRINT 'ctxoffset()' NW"; + VERIFY2(spec, "x", "0 1 -1"); // TEST #261 + if (errorCount) { std::cout << '\n' << errorCount << '/' << testCount << " tests failed.\n"; std::cout << "Failed tests: "; diff --git a/specs/src/utils/aluFunctions.cc b/specs/src/utils/aluFunctions.cc index e689be3..4a68192 100644 --- a/specs/src/utils/aluFunctions.cc +++ b/specs/src/utils/aluFunctions.cc @@ -419,6 +419,11 @@ PValue AluFunc_ctxrecno() return mkValue(g_pStateQueryAgent->getRecordCount() + g_pStateQueryAgent->getContextOffset()); } +PValue AluFunc_ctxoffset() +{ + return mkValue(g_pStateQueryAgent->getContextOffset()); +} + PValue AluFunc_eof() { bool isRunOut = g_pStateQueryAgent->isEOF(); diff --git a/specs/src/utils/aluFunctions.h b/specs/src/utils/aluFunctions.h index d21bdb9..619e7af 100644 --- a/specs/src/utils/aluFunctions.h +++ b/specs/src/utils/aluFunctions.h @@ -42,6 +42,8 @@ "() - Returns the record number of the current record.","Increments with every READ or READSTOP.") \ X(ctxrecno, 0, ALUFUNC_REGULAR, true, \ "() - Returns the record number of the record that input parts work on.","This is similar to recno, but considers rolling context, which recno does not.") \ + X(ctxoffset, 0, ALUFUNC_REGULAR, true, \ + "() - Returns the current effective context offset.","Returns 0 when no CONTEXT is in effect.") \ X(number, 0, ALUFUNC_REGULAR, true, \ "() - Returns the number of times this specification has restarted","Does not increment with READ or READSTOP. 
Otherwise similar to recno().") \ X(eof, 0, ALUFUNC_REGULAR, false, \ diff --git a/specs/tests/valgrind_unit_tests.py b/specs/tests/valgrind_unit_tests.py index acd636c..e87fc50 100644 --- a/specs/tests/valgrind_unit_tests.py +++ b/specs/tests/valgrind_unit_tests.py @@ -1,7 +1,7 @@ import sys, memcheck, argparse count_ALU_tests = 832 -count_processing_tests = 260 +count_processing_tests = 261 count_token_tests = 17 # Parse the one command line options From 9fcc9f5912990d707c5084534a5f8b8577c60c34 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Wed, 3 Jun 2026 23:08:22 +0300 Subject: [PATCH 32/50] Issue #410 - eliminate x86_64 from Mac OS build (#411) --- specs/src/setup.py | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/specs/src/setup.py b/specs/src/setup.py index 6221f9d..3cac026 100644 --- a/specs/src/setup.py +++ b/specs/src/setup.py @@ -127,9 +127,30 @@ def python_search(arg): with open("xx.txt", "r") as flags: filtered_flags = ['-g', '-O0', '-O1', '-O2', '-O3', '-Wstrict-prototypes'] filtered_flags_debug = ['-g', '-O0', '-O1', '-O2', '-O3', '-Wstrict-prototypes', '-Wp,-D_FORTIFY_SOURCE=2'] + # Multi-word flags to filter: list of (flag, value) tuples to remove + # For example: ('-arch', 'x86_64') will remove "-arch x86_64" but keep "-arch arm64" + filtered_multi_flags = [('-arch', 'x86_64')] cflags=flags.read().strip().split() filter = filtered_flags_debug if variation=="DEBUG" else filtered_flags - filtered_cflags = [f for f in cflags if f not in filter] + filtered_cflags = [] + i = 0 + while i < len(cflags): + flag = cflags[i] + # Check if this flag should be filtered out + should_filter = False + if flag in filter: + should_filter = True + else: + # Check multi-word flags + for multi_flag, multi_value in filtered_multi_flags: + if flag == multi_flag and i + 1 < len(cflags) and cflags[i + 1] == multi_value: + should_filter = True + i += 1 # Skip the next token (the value) + break + + if not should_filter: + 
filtered_cflags.append(flag) + i += 1 python_cflags = " ".join(filtered_cflags) + " -Wno-deprecated-register -fPIC" # Get the result of python-config --ldflags From a847156f546e1376b4e88e5400107fff0224699d Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Wed, 3 Jun 2026 23:11:13 +0300 Subject: [PATCH 33/50] Issue #406 - the new ALU function ctxoob (#409) --- manpage | 91 +++++++++++++++++++++++++ specs/docs/alu_adv.md | 20 ++++++ specs/src/processing/ProcessingState.cc | 13 +++- specs/src/processing/ProcessingState.h | 1 + specs/src/processing/Reader.cc | 9 +-- specs/src/processing/Reader.h | 4 ++ specs/src/specitems/InputPart.cc | 5 ++ specs/src/specitems/specItems.cc | 2 +- specs/src/test/ProcessingTest.cc | 64 +++++++++++++++++ specs/src/utils/alu.cc | 15 ++++ specs/src/utils/alu.h | 5 ++ specs/src/utils/aluFunctions.cc | 18 +++++ specs/src/utils/aluFunctions.h | 2 + specs/tests/valgrind_unit_tests.py | 2 +- 14 files changed, 244 insertions(+), 7 deletions(-) diff --git a/manpage b/manpage index 6ca73cf..067c016 100644 --- a/manpage +++ b/manpage @@ -420,6 +420,87 @@ does not cause processing to stop, even if a .B READSTOP token is present in the specification. +.SS "Out-of-Bounds (OOB) Records" +When +.B CONTEXT +or an +.B @\(+-n +expression refers to a record beyond the beginning or end of the input, or when a +.B READ +runs dry, the working string becomes an +.I out-of-bounds +(OOB) record. An OOB record behaves as an empty string, but it additionally carries a hidden flag marking it as out of bounds. This flag lets you tell the difference between a record that is genuinely empty and one that is empty only because it lies past the edge of the input. The flag is queried with the +.B ctxoob() +function (see +.B "Built-In Functions" ). +.P +The OOB flag travels with the working string only as long as the OOB record is read +.I directly. +It is preserved by the following: +.RS +.IP \(bu 3 +.B ctxoob() +with no argument (it inspects the current working string). 
+.IP \(bu 3 +The record-access functions +.B record(), +.B range(), +.B substr(), +.B word(), +.B wordrange(), +.B field() +and +.B fieldrange() +when they operate on the current (OOB) record. +.IP \(bu 3 +The +.B @@, +.B @! +and +.B @\(+-n +input-record expressions. +.IP \(bu 3 +Character, word and field range labels (for example +.B "1-5 a:" , +.B "w1-3 x:" +or +.B "f1 y:" ) +that capture the OOB record. The captured label, e.g. +.B a, +remains OOB and can be tested with +.B ctxoob(a). +.RE +.P +The OOB flag is +.B not +preserved once the value is copied into a plain text value that no longer refers to the working string. Known cases where the flag is lost include: +.RS +.IP \(bu 3 +Assigning into a numbered counter with a +.B SET +spec unit (or any +.B := +assignment). For example, after +.B "CONTEXT 1 set #5:=record()" , +the expression +.B ctxoob(#5) +returns 0, because the counter stores only the (empty) text and not the OOB flag. +.IP \(bu 3 +Values that pass through an output-producing spec unit such as +.B PRINT +before being re-captured. For example, in +.B "CONTEXT 1 1-5 a: PRINT \(dqa\(dq b: PRINT \(dqctxoob(b)\(dq" , +the label +.B b +is no longer OOB and +.B ctxoob(b) +returns 0. +.IP \(bu 3 +Other operations that materialize the working string as ordinary text may likewise drop the flag. +.RE +.P +In short: query OOB status as close as possible to where the OOB record is read, and do not expect it to survive a round trip through a counter or other plain-text storage. + .SS "MainOptions" These are optional spec units that appear at the beginning of the specification and modify the behavior of the entire specification. .IP "STOP" 3 @@ -977,6 +1058,16 @@ is active, returns the same value as Returns the current effective context offset. When no .I CONTEXT is active, returns 0. 
+.IP "ctxoob()" 3 +Returns 1 if the argument string came from out-of-bounds input (either via +.I CONTEXT +or +.I @±n +reading past the input edges, or via a +.I READ +that ran dry), and 0 otherwise. With no argument, checks the current (context-affected) record. The out-of-bounds property is not preserved by all operations; see the subsection +.B "Out-of-Bounds (OOB) Records" +for details and limitations. .IP "record()" 3 Returns the entire input record. Equivalent to .B range(1,-1) diff --git a/specs/docs/alu_adv.md b/specs/docs/alu_adv.md index 764953f..7e0b9b9 100644 --- a/specs/docs/alu_adv.md +++ b/specs/docs/alu_adv.md @@ -166,6 +166,7 @@ All three regular expression functions have an argument called `matchFlags`. Thi | `recno()` | Returns the number of the currently read record. If the `READ` or `READSTOP` keywords are used this may be greater than `number()` | | `ctxrecno()` | Returns the record number of the record that input parts work on. This is similar to `recno()`, but considers rolling context, which `recno()` does not. | | `ctxoffset()` | Returns the current effective context offset. Returns 0 when no `CONTEXT` is in effect. | +| `ctxoob(s)` | Returns 1 if the argument string came from out-of-bounds input, 0 otherwise. With no argument, checks the current (context-affected) record. The out-of-bounds property is not preserved by all operations -- see [Out-of-Bounds (OOB) Records](#out-of-bounds-oob-records) below. | | `record()` | Returns the entire input record. Equivalent to `@!`. | | `cfrecord()` | Returns the entire input record, disregarding rolling context. Same as `record()` when `CONTEXT` is not in effect. Equivalent to `@@`. | | `word(n)` | Returns the *n*-th word | @@ -239,6 +240,25 @@ The parameters for the `fmap_dump` functions are as follows: | `next()` | Returns the index of the print position. `w1 "(next())"` should do the same as `w1 next`. 
| | `exact(expression)` | Returns `1` if the evaluation of the `expression` results in an exact value, or `0` if some rounding and/or loss of precision has been involved in the computaion. The function has some limitations and will err on the side of returning `0` when it's unsure. | +## Out-of-Bounds (OOB) Records + +When `CONTEXT` or a `@±n` expression refers to a record beyond the beginning or end of the input, or when a `READ` runs dry, the working string becomes an *out-of-bounds* (OOB) record. An OOB record behaves as an empty string, but it additionally carries a hidden flag marking it as out of bounds. This flag lets you distinguish a record that is genuinely empty from one that is empty only because it lies past the edge of the input. The flag can be queried with the `ctxoob()` function. + +The OOB flag travels with the working string only as long as the OOB record is read **directly**. It is *preserved* by: + +- `ctxoob()` with no argument (it inspects the current working string). +- The record-access functions `record()`, `range()`, `substr()`, `word()`, `wordrange()`, `field()` and `fieldrange()` when they operate on the current (OOB) record. +- The `@@`, `@!` and `@±n` input-record expressions. +- Character, word and field range labels (for example `1-5 a:`, `w1-3 x:` or `f1 y:`) that capture the OOB record. The captured label, e.g. `a`, remains OOB and can be tested with `ctxoob(a)`. + +The OOB flag is **not** preserved once the value is copied into a plain-text value that no longer refers to the working string. Known cases where the flag is lost include: + +- **Assignment into a numbered counter** with a `SET` spec unit (or any `:=` assignment). For example, after `CONTEXT 1 set "#5:=record()"`, the expression `ctxoob(#5)` returns `0`, because the counter stores only the (empty) text and not the OOB flag. +- **Values that pass through an output-producing spec unit** such as `PRINT` before being re-captured. 
For example, in `CONTEXT 1 1-5 a: PRINT "a" b: PRINT "ctxoob(b)"`, the label `b` is no longer OOB and `ctxoob(b)` returns `0`. +- Other operations that materialize the working string as ordinary text may likewise drop the flag. + +In short: query OOB status as close as possible to where the OOB record is read, and do not expect it to survive a round trip through a counter or other plain-text storage. + diff --git a/specs/src/processing/ProcessingState.cc b/specs/src/processing/ProcessingState.cc index 02cdbee..a6e4b4a 100644 --- a/specs/src/processing/ProcessingState.cc +++ b/specs/src/processing/ProcessingState.cc @@ -366,6 +366,10 @@ PSpecString ProcessingState::getFromTo(int from, int to) if (m_inputStation != STATION_SECOND && nullptr==m_ps) { return std::make_shared<std::string>(); } + // If current record is OOB, preserve OOB status + if (Reader::isOOBRecord(currRecord())) { + return currRecord(); + } int slen = (int)(currRecord()->length()); if (0==from && 0==to) return std::make_shared<std::string>(); @@ -426,7 +430,8 @@ void ProcessingState::fieldIdentifierSet(char id, PSpecString ps) std::cerr << "WARNING: Field Identifier <" << id << "> redefined.\n"; } - m_fieldIdentifiers[id] = std::make_shared<std::string>(*ps); + // Store the PSpecString directly to preserve OOB status + m_fieldIdentifiers[id] = ps; // Count the statistics of this field value. 
if (ALUFUNC_STATISTICAL & AluFunction::functionTypes()) { @@ -594,3 +599,9 @@ std::string ProcessingStateFieldIdentifierGetter::Get(char id) PSpecString ret = m_ps->fieldIdentifierGet(id); return std::string(ret->data(), ret->length()); } + +bool ProcessingStateFieldIdentifierGetter::isOOB(char id) +{ + if (!m_ps->fieldIdentifierIsSet(id)) return false; + return Reader::isOOBRecord(m_ps->fieldIdentifierGet(id)); +} diff --git a/specs/src/processing/ProcessingState.h b/specs/src/processing/ProcessingState.h index 931690a..c789dca 100644 --- a/specs/src/processing/ProcessingState.h +++ b/specs/src/processing/ProcessingState.h @@ -154,6 +154,7 @@ class ProcessingStateFieldIdentifierGetter : public fieldIdentifierGetter { ProcessingStateFieldIdentifierGetter(ProcessingState* _ps) : m_ps(_ps) {} ~ProcessingStateFieldIdentifierGetter() override {} std::string Get(char id) override; + bool isOOB(char id) override; private: ProcessingState* m_ps; }; diff --git a/specs/src/processing/Reader.cc b/specs/src/processing/Reader.cc index 9164ed5..160899a 100644 --- a/specs/src/processing/Reader.cc +++ b/specs/src/processing/Reader.cc @@ -6,6 +6,7 @@ uint64_t g_readRecordCounter = 0; Reader* g_pReader = nullptr; +PSpecString g_pOOBSpecString = std::make_shared<std::string>(); void ReadAllRecordsIntoReaderQueue(Reader* r) { @@ -217,16 +218,16 @@ void StandardReader::setContextSizes(unsigned int forward, unsigned int backward PSpecString StandardReader::peek(int offset) { if (offset == 0) { - return m_currentRecord ? m_currentRecord : std::make_shared<std::string>(); + return m_currentRecord ? 
m_currentRecord : g_pOOBSpecString; } if (offset < 0) { unsigned int idx = (unsigned int)(-offset) - 1; - if (idx >= m_backwardBuffer.size()) return std::make_shared<std::string>(); + if (idx >= m_backwardBuffer.size()) return g_pOOBSpecString; return m_backwardBuffer[m_backwardBuffer.size() - 1 - idx]; } // offset > 0 unsigned int idx = (unsigned int)offset - 1; - if (idx >= m_forwardBuffer.size()) return std::make_shared<std::string>(); + if (idx >= m_forwardBuffer.size()) return g_pOOBSpecString; return m_forwardBuffer[idx]; } @@ -399,7 +400,7 @@ PSpecString TestReader::peek(int offset) // m_idx points to the *next* record to read, so current record is m_idx-1 int target = int(m_idx) - 1 + offset; if (target < 0 || target >= int(m_count)) { - return std::make_shared<std::string>(); // empty string for out-of-bounds + return g_pOOBSpecString; // sentinel for out-of-bounds } return mp_arr[target]; } diff --git a/specs/src/processing/Reader.h b/specs/src/processing/Reader.h index e398b12..ab50991 100644 --- a/specs/src/processing/Reader.h +++ b/specs/src/processing/Reader.h @@ -13,6 +13,9 @@ #define STOP_STREAM_INVALID 99 #define IS_SPECIFIC_STREAM(x) ((x) mp_thread; diff --git a/specs/src/specitems/InputPart.cc b/specs/src/specitems/InputPart.cc index 11b74ec..272e421 100644 --- a/specs/src/specitems/InputPart.cc +++ b/specs/src/specitems/InputPart.cc @@ -2,6 +2,7 @@ #include "processing/Config.h" #include "utils/TimeUtils.h" #include "utils/ErrorReporting.h" +#include "processing/Reader.h" #include "item.h" #include @@ -54,6 +55,8 @@ std::string WordRangePart::Debug() PSpecString WordRangePart::getStr(ProcessingState& pState) { if (pState.recordNotAvailable()) return std::make_shared<std::string>(); + // If current record is OOB, preserve OOB status + if (Reader::isOOBRecord(pState.currRecord())) return pState.currRecord(); std::string keepSeparator(DEFAULT_WORDSEPARATOR); if (!m_WordSep.empty()) { keepSeparator = pState.getWSChars(); @@ -87,6 +90,8 
@@ std::string FieldRangePart::Debug() PSpecString 
FieldRangePart::getStr(ProcessingState& pState) { if (pState.recordNotAvailable()) return std::make_shared<std::string>(); + // If current record is OOB, preserve OOB status + if (Reader::isOOBRecord(pState.currRecord())) return pState.currRecord(); std::string keepSeparator(DEFAULT_FIELDSEPARATOR); if (!m_FieldSep.empty()) { keepSeparator = pState.getFSChars(); diff --git a/specs/src/specitems/specItems.cc b/specs/src/specitems/specItems.cc index f708536..0e6bb4e 100644 --- a/specs/src/specitems/specItems.cc +++ b/specs/src/specitems/specItems.cc @@ -682,7 +682,7 @@ bool itemGroup::processDo(StringBuilder& sb, ProcessingState& pState, Reader* pR ps = pRd->get(tmr, rdrCounter); if (!ps) { if (aRet==ApplyRet__Read) { - ps = std::make_shared<std::string>(); + ps = g_pOOBSpecString; } else { processingContinue = false; // Stop processing if no extra record is available } diff --git a/specs/src/test/ProcessingTest.cc b/specs/src/test/ProcessingTest.cc index 9134d47..652ee12 100644 --- a/specs/src/test/ProcessingTest.cc +++ b/specs/src/test/ProcessingTest.cc @@ -1128,6 +1128,70 @@ int main(int argc, char** argv) spec = "PRINT 'ctxoffset()' 1 CONTEXT +1 PRINT 'ctxoffset()' NW CONTEXT -1 PRINT 'ctxoffset()' NW"; VERIFY2(spec, "x", "0 1 -1"); // TEST #261 + // ctxoob() function test - CONTEXT-based + spec = "PRINT 'ctxoob()' 1 CONTEXT +1 PRINT 'ctxoob()' NW CONTEXT -1 PRINT 'ctxoob()' NW"; + VERIFY2(spec, "x", "0 1 1"); // TEST #262 + + // ctxoob() function test - @± expression-based + spec = "PRINT 'ctxoob(@@)' 1 PRINT 'ctxoob(@+1)' NW PRINT 'ctxoob(@-1)' NW"; + VERIFY2(spec, "x", "0 1 1"); // TEST #263 + + // ctxoob() function test - OOB status preserved through record() + spec = "CONTEXT +1 PRINT 'ctxoob(record())'"; + VERIFY2(spec, "hello", "1"); // TEST #264 + + // ctxoob() function test - OOB status preserved through word() + spec = "CONTEXT +1 PRINT 'ctxoob(word(1))'"; + VERIFY2(spec, "hello world", "1"); // TEST #265 + + // ctxoob() function test - normal 
record() should return 0 + spec = 
"PRINT 'ctxoob(record())'"; + VERIFY2(spec, "hello", "0"); // TEST #266 + + // ctxoob() function test - normal word() should return 0 + spec = "PRINT 'ctxoob(word(1))'"; + VERIFY2(spec, "hello world", "0"); // TEST #267 + + // ctxoob() function test - OOB status preserved through range() + spec = "CONTEXT +1 PRINT 'ctxoob(range(1,3))'"; + VERIFY2(spec, "hello", "1"); // TEST #268 + + // ctxoob() function test - normal range() should return 0 + spec = "PRINT 'ctxoob(range(1,3))'"; + VERIFY2(spec, "hello", "0"); // TEST #269 + + // ctxoob() function test - OOB status preserved through substr() of current record + spec = "CONTEXT +1 PRINT 'ctxoob(substr(,1,5))'"; + VERIFY2(spec, "hello", "1"); // TEST #270 + + // ctxoob() function test - normal substr() should return 0 + spec = "PRINT 'ctxoob(substr(,1,5))'"; + VERIFY2(spec, "hello", "0"); // TEST #271 + + // ctxoob() function test - OOB status preserved through range-label variable + spec = "CONTEXT +1 1-5 a: PRINT 'ctxoob(a)'"; + VERIFY2(spec, "hello", "1"); // TEST #272 + + // ctxoob() function test - normal range-label variable should return 0 + spec = "1-5 a: PRINT 'ctxoob(a)'"; + VERIFY2(spec, "hello", "0"); // TEST #273 + + // ctxoob() function test - OOB status preserved through word-range label + spec = "CONTEXT +1 w1-3 x: PRINT 'ctxoob(x)'"; + VERIFY2(spec, "a b c", "1"); // TEST #274 + + // ctxoob() function test - normal word-range label should return 0 + spec = "w1-3 x: PRINT 'ctxoob(x)'"; + VERIFY2(spec, "a b c", "0"); // TEST #275 + + // ctxoob() function test - OOB status preserved through field-range label + spec = "fs : CONTEXT +1 f1 y: PRINT 'ctxoob(y)'"; + VERIFY2(spec, "a:b:c", "1"); // TEST #276 + + // ctxoob() function test - normal field-range label should return 0 + spec = "fs : f1 y: PRINT 'ctxoob(y)'"; + VERIFY2(spec, "a:b:c", "0"); // TEST #277 + if (errorCount) { std::cout << '\n' << errorCount << '/' << testCount << " tests failed.\n"; std::cout << "Failed tests: "; diff --git 
a/specs/src/utils/alu.cc b/specs/src/utils/alu.cc index 50683a5..da952f8 100644 --- a/specs/src/utils/alu.cc +++ b/specs/src/utils/alu.cc @@ -22,6 +22,14 @@ extern stateQueryAgent* g_pStateQueryAgent; extern unsigned int g_forwardContext; extern unsigned int g_backwardContext; +// Sentinel pointer for out-of-bounds values +PValue g_pOOBValue = std::make_shared<ALUValue>(std::string("")); + +bool isOOBValue(PValue pv) +{ + return pv == g_pOOBValue; +} + void ALUValue::set(std::string& s) { m_value = s; @@ -287,6 +295,9 @@ PValue AluUnitFieldIdentifier::evaluate() if (!g_fieldIdentifierGetter) { MYTHROW("Field Identifier Getter is not set") } + if (g_fieldIdentifierGetter->isOOB(m_id)) { + return g_pOOBValue; + } return mkValue(g_fieldIdentifierGetter->Get(m_id)); } @@ -894,6 +905,10 @@ PValue AluInputRecord::evaluate() MYASSERT_WITH_MSG(g_pReader != nullptr, "Rolling context requires a reader"); ps = g_pReader->peek(m_offset); } + // Check if this is an out-of-bounds record + if (Reader::isOOBRecord(ps)) { + return g_pOOBValue; + } PValue ret; if (ps) { ret = mkValue2(ps->data(), int(ps->length())); diff --git a/specs/src/utils/alu.h b/specs/src/utils/alu.h index 3a180e2..44afb73 100644 --- a/specs/src/utils/alu.h +++ b/specs/src/utils/alu.h @@ -12,6 +12,10 @@ std::ostream& operator<< (std::ostream& os, const ALUValue &c); +// Sentinel pointer for out-of-bounds values +extern PValue g_pOOBValue; +bool isOOBValue(PValue pv); + typedef unsigned int ALUCounterKey; class ALUCounters { @@ -160,6 +164,7 @@ class fieldIdentifierGetter { public: virtual ~fieldIdentifierGetter() {} virtual std::string Get(char id) = 0; + virtual bool isOOB(char id) {return false;} }; void setFieldIdentifierGetter(fieldIdentifierGetter* getter); diff --git a/specs/src/utils/aluFunctions.cc b/specs/src/utils/aluFunctions.cc index 4a68192..b288ddc 100644 --- a/specs/src/utils/aluFunctions.cc +++ b/specs/src/utils/aluFunctions.cc @@ -6,6 +6,7 @@ #include "processing/Config.h" #include 
"processing/persistent.h" #include "processing/ProcessingState.h" +#include "processing/Reader.h" #include #include #include @@ -424,6 +425,15 @@ PValue AluFunc_ctxoffset() return mkValue(g_pStateQueryAgent->getContextOffset()); } +PValue AluFunc_ctxoob(PValue pArg) +{ + if (nullptr == pArg) { + return mkValue(ALUInt(Reader::isOOBRecord(g_pStateQueryAgent->currRecord()) ? 1 : 0)); + } else { + return mkValue(ALUInt(isOOBValue(pArg) ? 1 : 0)); + } +} + PValue AluFunc_eof() { bool isRunOut = g_pStateQueryAgent->isEOF(); @@ -491,6 +501,10 @@ PValue AluFunc_fieldcount(PValue pStr, PValue pSep) // Helper function static PValue AluFunc_range(ALUInt start, ALUInt end) { + // If the current record is out-of-bounds, preserve that status + if (Reader::isOOBRecord(g_pStateQueryAgent->currRecord())) { + return g_pOOBValue; + } PSpecString pRange = g_pStateQueryAgent->getFromTo(start, end); if (pRange) { PValue pRet = mkValue(pRange->data()); @@ -672,6 +686,10 @@ static PValue AluFunc_substring_do(std::string* pStr, ALUInt start, ALUInt lengt PValue AluFunc_substr(PValue pBigString, PValue pStart, PValue pLength) { ASSERT_ARG_OR_RECORD(pBigString,1,str); + // If no argument provided and current record is OOB, preserve OOB status + if (!pBigString && Reader::isOOBRecord(g_pStateQueryAgent->currRecord())) { + return g_pOOBValue; + } std::string* pBigStr = (pBigString) ? 
pBigString->getStrPtr() : g_pStateQueryAgent->currRecord().get(); ALUInt start = ARG_INT_WITH_DEFAULT(pStart,1); ALUInt length = ARG_INT_WITH_DEFAULT(pLength,-1); diff --git a/specs/src/utils/aluFunctions.h b/specs/src/utils/aluFunctions.h index 619e7af..15f20f2 100644 --- a/specs/src/utils/aluFunctions.h +++ b/specs/src/utils/aluFunctions.h @@ -44,6 +44,8 @@ "() - Returns the record number of the record that input parts work on.","This is similar to recno, but considers rolling context, which recno does not.") \ X(ctxoffset, 0, ALUFUNC_REGULAR, true, \ "() - Returns the current effective context offset.","Returns 0 when no CONTEXT is in effect.") \ + X(ctxoob, 1, ALUFUNC_REGULAR, false, \ + "(s) - Returns 1 if the argument string came from out-of-bounds input, 0 otherwise.","With no argument, checks the current (context-affected) record.") \ X(number, 0, ALUFUNC_REGULAR, true, \ "() - Returns the number of times this specification has restarted","Does not increment with READ or READSTOP. 
Otherwise similar to recno().") \ X(eof, 0, ALUFUNC_REGULAR, false, \ diff --git a/specs/tests/valgrind_unit_tests.py b/specs/tests/valgrind_unit_tests.py index e87fc50..aabf596 100644 --- a/specs/tests/valgrind_unit_tests.py +++ b/specs/tests/valgrind_unit_tests.py @@ -1,7 +1,7 @@ import sys, memcheck, argparse count_ALU_tests = 832 -count_processing_tests = 261 +count_processing_tests = 277 count_token_tests = 17 # Parse the one command line options From 6d09ed30e075351b3594f6f57facb4ab1903e512 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Thu, 4 Jun 2026 12:03:40 +0300 Subject: [PATCH 34/50] Issue #413 - add the not function (#414) --- manpage | 10 +++++++++- specs/docs/alu_adv.md | 7 ++++--- specs/src/test/ALUUnitTest.cc | 15 +++++++++++++++ specs/src/utils/aluFunctions.cc | 14 ++++++++++++++ specs/src/utils/aluFunctions.h | 4 +++- specs/tests/valgrind_unit_tests.py | 2 +- 6 files changed, 46 insertions(+), 6 deletions(-) diff --git a/manpage b/manpage index 067c016..3d97907 100644 --- a/manpage +++ b/manpage @@ -799,7 +799,9 @@ Expressions are used in PRINT data units as well as in assignments. Assignments Expressions are made up of numbers, field identifiers (no colon needed), and counter numbers preceded by a hash mark. They allow ordinary arithmetic and logical operations as well as function calls for pre-defined functions: .IP "Unary Operators" 3 -+ (plus - does nothing), - (minus), ! (logical NOT) ++ (plus - does nothing), - (minus), ! (logical NOT - equivalent to the +.I not +built-in function) .IP "Binary Arithmetic Operators" 3 +, -, *. / (division), // (integer division), % (remainder) .IP "Binary String Operator" 3 @@ -1595,6 +1597,12 @@ is positive, 0 if the is 0, and -1 if the .I number is negative. +.IP "not(expression)" 3 +Returns 1 if the +.I expression +is an integer zero, or 0 if it is anything else. It serves as an alternative to the +.B unary logical not +operator (!). 
.IP "space(string,length,pad)" 3 Formats a .I string diff --git a/specs/docs/alu_adv.md b/specs/docs/alu_adv.md index 7e0b9b9..ffac8e0 100644 --- a/specs/docs/alu_adv.md +++ b/specs/docs/alu_adv.md @@ -6,7 +6,7 @@ | -- | ---- | ------- | | `+` | Unary Plus | | | `-` | Unary Minus | Negates its operand, so if `a` is 5.3 then `-a` is -5.3 | -| `!` | Unary Not | Logical Not. If the result is zero, returns `1`, otherwise returns zero | +| `!` | Unary Not | Logical Not. If the result is zero, returns `1`, otherwise returns zero. Equivalent to the `not` built-in function (see below) | | `+` | Binary Plus | Returns the sum of its two operands | | `-` | Binary Minus | Returns the difference between the left-hand operand and the right-hand operand | | `*` | Binary Multiplication | Returns the product of its two operands | @@ -42,7 +42,7 @@ | `%=` | RemDiv | Divides the value of the left-hand counter by the right-hand counter, storing the **remainder** in that counter | | `\|\|=` | Appnd | Appends the string value of the right-hand operand to the string value of the left-hand counter, storing the concatenation in that counter | -## Table of Numerical Functions +## Table of Numerical and Logical Functions | Function | Description | | -------- | ----------- | | `abs(x)` | Returns the absolute value of `x` | @@ -77,6 +77,8 @@ | `tan(x)` | Returns the tangent function, treating `x` as an angle expressed in radians | | `tobin(x)` | Returns a binary (usually unprintable) representation of the integer number x. For example, if `x` is 65 the function returns "A"; if `x` is 16961 the function returns "AB". | | `tobine(x,n)` | Returns a binary representation of the integer number x as an *n*-byte string. | +| `sign(number)` | Returns 1 if the `number` is positive, 0 if the `number` is 0, and -1 if the `number` is negative. | +| `not(expr)` | Returns 1 if the `expr` is an integer zero, or 0 if it is anything else. It serves as an alternative to the **unary logical not** operator (!). 
| ## Table of String Functions | Function | Description | @@ -135,7 +137,6 @@ All three regular expression functions have an argument called `matchFlags`. Thi | `justify(string,length,pad)` | Evenly justifies words within `string`. The `length` specifies the length of the returned string, while `pad` specifies what padding (by default a space) to insert (if necessary). | | `overlay(string1, string2 ,start ,length ,pad)` | Returns a copy of `string2`, partially or fully overwritten by `string1`. `start` specifies the starting position of the overlay. `length` truncates or pads `string1` prior to the operation, using `pad` as the pad character. | | `reverse(string)` | Returns a copy of a `string` with its characters reversed. | -| `sign(number)` | Returns 1 if the `number` is positive, 0 if the `number` is 0, and -1 if the `number` is negative. | | `space(string,length,pad)` | Formats a `string` by replacing internal blanks with `length` occurrences of the `pad` character. The default pad character is blank and the default length is 1. Leading and trailing blanks are always removed. If `length` is 0, all blanks are removed. | | strip(string,option,pad-chars) | Returns `string` stripped of leading and/or trailing blanks or any other character specified in the `pad-chars` string. `Option` values determine the action: *L* for leading, *T* for trailing, and *B* for both (the default) | | `subword(string,start,length)` | Returns the substring that begins at blank-delimited word `start`. If `length` is omitted, it defaults to the remainder of the string. 
| diff --git a/specs/src/test/ALUUnitTest.cc b/specs/src/test/ALUUnitTest.cc index e7fb033..53023e7 100644 --- a/specs/src/test/ALUUnitTest.cc +++ b/specs/src/test/ALUUnitTest.cc @@ -1650,6 +1650,21 @@ int runALUUnitTests16(unsigned int onlyTest) VERIFY_ASSN_RES("#10:=pget(unitTestVar)", "2"); VERIFY_EXPR_RES("exact(#10)", "0"); + // not() function + std::cout << "\nThe not() function\n======================\n\n"; + VERIFY_EXPR_RES("not(1)", "0"); + VERIFY_EXPR_RES("not(0)", "1"); + VERIFY_EXPR_RES("not(3.1415)", "0"); + VERIFY_EXPR_RES("not(0.3333)", "0"); + VERIFY_EXPR_RES("not('hello')", "0"); + VERIFY_EXPR_RES("not('')", "0"); + VERIFY_EXPR_RES("not(2+2==4)", "0"); + VERIFY_EXPR_RES("not(2>3)", "1"); + VERIFY_EXPR_RES("not(exact(#10))", "1"); + VERIFY_EXPR_RES("not(includes(raid,'i'))", "0"); + VERIFY_EXPR_RES("not(includes(team,'i'))", "1"); // proving that there is really no 'i' in team + + if (countFailures) { std::cout << "\n*** " << countFailures << " of " << testIndex << " tests failed.\n"; std::cout << "Failed tests:\n"; diff --git a/specs/src/utils/aluFunctions.cc b/specs/src/utils/aluFunctions.cc index b288ddc..0eb302c 100644 --- a/specs/src/utils/aluFunctions.cc +++ b/specs/src/utils/aluFunctions.cc @@ -2394,6 +2394,20 @@ PValue AluFunc_sign(PValue pNumber) return mkValue(ret); } +PValue AluFunc_not(PValue pNumber) +{ + ASSERT_NOT_ELIDED(pNumber,1,number); + ALUInt ret = 0; + switch (pNumber->getDivinedType()) { + case counterType__Int: + ret = (0 == pNumber->getInt() ? 
1 : 0); + break; + default: + ret = 0; + } + return mkValue(ret); +} + PValue AluFunc_space(PValue pStr, PValue pLength, PValue pPad) { ASSERT_NOT_ELIDED(pStr,1,string); diff --git a/specs/src/utils/aluFunctions.h b/specs/src/utils/aluFunctions.h index 15f20f2..b67e9ba 100644 --- a/specs/src/utils/aluFunctions.h +++ b/specs/src/utils/aluFunctions.h @@ -206,7 +206,7 @@ "(fid,elem) - Notes an occurence of the value in 'elem' for field identifier 'fid', and returns the number of occurences so far.","This is the only one of the fmap_* functions that modifies the frequency map.\nIt also affects the other statistics functions.") \ X(fmap_dump, 4, ALUFUNC_FREQUENCY, false, \ "(fid,fmt,order,pct) - Returns a multi-line string with the frequency map of field identifier 'fid'.","Only provides information relevant to the entire data set during the run-out cycle.\nFormat can be 'txt' or '0' for a textual table; 'lin' for a table with lines, and 'csv' or 'json' for those formats.\nOrder is 's'/'sa' to sort by ascending value, or 'sd' for descending, 'c'/'ca' for sorting by ascending count, or 'cd' for descending.\n'pct' adds a percentage column if true.") \ - H(Advanced Math Functions,20) \ + H(Advanced Math Functions,21) \ X(rand, 1, ALUFUNC_REGULAR, false, \ "([limit]) - Returns a random integer up to (but not including) 'limit'.","If 'limit' is omitted, returns a floating point number between 0 and 1.") \ X(floor, 1, ALUFUNC_REGULAR, false, \ @@ -258,6 +258,8 @@ "(s1,s2) - Returns a bit-wise XOR of the two strings s1 and s2.","If the strings are not equal in length, the result has the length of the shorter one.\nIf an operand is not a string, it is converted to a decimal string representation.") \ X(sign, 1, ALUFUNC_REGULAR, false, \ "(x) - Returns -1/0/1 for negative/zero/positive x.","") \ + X(not, 1, ALUFUNC_REGULAR, false, \ + "(x) - Returns 1 if x is zero, or 0 otherwise. 
Serves as a logical NOT, an alternative to using the unary operator.","") \ X(space, 3, ALUFUNC_REGULAR, false, \ "(str,[len],[pad]) - Formats 'str' by replacing internal blanks with 'len' occurrences of the 'pad' character.","'len' defaults to 1. 'pad' defaults to a space.") \ X(strip, 3, ALUFUNC_REGULAR, false, \ diff --git a/specs/tests/valgrind_unit_tests.py b/specs/tests/valgrind_unit_tests.py index aabf596..7d5bf86 100644 --- a/specs/tests/valgrind_unit_tests.py +++ b/specs/tests/valgrind_unit_tests.py @@ -1,6 +1,6 @@ import sys, memcheck, argparse -count_ALU_tests = 832 +count_ALU_tests = 843 count_processing_tests = 277 count_token_tests = 17 From 777075f248bc6b7e6c2316c499df5cfab5c5f49b Mon Sep 17 00:00:00 2001 From: niry1 Date: Thu, 4 Jun 2026 12:24:32 +0300 Subject: [PATCH 35/50] Issue #412 - Start branch for Nodejs 24 transition --- .github/workflows/c-cpp.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/c-cpp.yml b/.github/workflows/c-cpp.yml index 95a5cc0..cdfc89c 100644 --- a/.github/workflows/c-cpp.yml +++ b/.github/workflows/c-cpp.yml @@ -2,9 +2,9 @@ name: C/C++ CI on: push: - branches: [ dev, stable, dev-0.9.9, dev-1.0.0, dev-rolling-context] + branches: [ dev, stable, dev-0.9.9, dev-1.0.0, dev-nodejs-24] pull_request: - branches: [ dev, stable, dev-0.9.9, dev-1.0.0, dev-rolling-context] + branches: [ dev, stable, dev-0.9.9, dev-1.0.0, dev-nodejs-24] env: SPECS_BRANCH: ${{ github.event.pull_request.base.ref || github.ref_name }} From 29af95aa9283d9a7972fb11c88cde8b0a3ee17ec Mon Sep 17 00:00:00 2001 From: niry1 Date: Thu, 4 Jun 2026 12:33:50 +0300 Subject: [PATCH 36/50] Issue #412 - force nodejs 24 --- .github/workflows/c-cpp.yml | 15 ++++++++------- .github/workflows/release.yml | 15 ++++++++------- 2 files changed, 16 insertions(+), 14 deletions(-) diff --git a/.github/workflows/c-cpp.yml b/.github/workflows/c-cpp.yml index cdfc89c..bb4ca50 100644 --- a/.github/workflows/c-cpp.yml +++ 
b/.github/workflows/c-cpp.yml @@ -8,13 +8,14 @@ on: env: SPECS_BRANCH: ${{ github.event.pull_request.base.ref || github.ref_name }} + FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true jobs: build-linux: runs-on: ubuntu-latest steps: - - uses: actions/checkout@v4 + - uses: actions/checkout@v6 with: fetch-depth: 0 @@ -26,7 +27,7 @@ jobs: echo "$PR_EVENT" - name: Set up Python 3.12 - uses: actions/setup-python@v5 + uses: actions/setup-python@v6 with: python-version: '3.12' @@ -49,12 +50,12 @@ jobs: runs-on: macos-latest steps: - - uses: actions/checkout@v4 + - uses: actions/checkout@v6 with: fetch-depth: 0 - name: Set up Python 3.12 - uses: actions/setup-python@v5 + uses: actions/setup-python@v6 with: python-version: '3.12' @@ -77,7 +78,7 @@ jobs: runs-on: windows-latest steps: - - uses: actions/checkout@v4 + - uses: actions/checkout@v6 with: fetch-depth: 0 @@ -97,12 +98,12 @@ jobs: runs-on: windows-latest steps: - - uses: actions/checkout@v4 + - uses: actions/checkout@v6 with: fetch-depth: 0 - name: Set up Python 3.12 - uses: actions/setup-python@v5 + uses: actions/setup-python@v6 with: python-version: '3.12' diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index a161edc..69c5008 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -12,6 +12,7 @@ on: env: SPECS_VERSION: ${{ github.event.release.tag_name }} SPECS_BRANCH: ${{ github.event.release.target_commitish }} + FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true jobs: build-linux: @@ -19,7 +20,7 @@ jobs: container: image: ubuntu:22.04 steps: - - uses: actions/checkout@v4 + - uses: actions/checkout@v6 with: fetch-depth: 0 @@ -103,12 +104,12 @@ jobs: build-macos: runs-on: macos-latest steps: - - uses: actions/checkout@v4 + - uses: actions/checkout@v6 with: fetch-depth: 0 - name: Set up Python 3.12 - uses: actions/setup-python@v5 + uses: actions/setup-python@v6 with: python-version: '3.12' @@ -160,7 +161,7 @@ jobs: build-windows: runs-on: windows-latest steps: - - uses: 
actions/checkout@v4 + - uses: actions/checkout@v6 with: fetch-depth: 0 @@ -241,12 +242,12 @@ jobs: build-windows-python: runs-on: windows-latest steps: - - uses: actions/checkout@v4 + - uses: actions/checkout@v6 with: fetch-depth: 0 - name: Set up Python 3.12 - uses: actions/setup-python@v5 + uses: actions/setup-python@v6 with: python-version: '3.12' @@ -337,7 +338,7 @@ jobs: runs-on: ${{ matrix.runner }} container: ${{ matrix.container || '' }} steps: - - uses: actions/checkout@v4 + - uses: actions/checkout@v6 with: fetch-depth: 0 From 7b042b89252264206eaf63087efc1868afc1448b Mon Sep 17 00:00:00 2001 From: niry1 Date: Thu, 4 Jun 2026 12:45:52 +0300 Subject: [PATCH 37/50] Also msbuild --- .github/workflows/c-cpp.yml | 4 ++-- .github/workflows/release.yml | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/.github/workflows/c-cpp.yml b/.github/workflows/c-cpp.yml index bb4ca50..850fd04 100644 --- a/.github/workflows/c-cpp.yml +++ b/.github/workflows/c-cpp.yml @@ -83,7 +83,7 @@ jobs: fetch-depth: 0 - name: Add MSBuild to PATH - uses: microsoft/setup-msbuild@v2 + uses: microsoft/setup-msbuild@v3 - name: Build specs (Debug) run: msbuild specs/specs.sln /p:Configuration=Debug /p:Platform=x64 @@ -108,7 +108,7 @@ jobs: python-version: '3.12' - name: Add MSBuild to PATH - uses: microsoft/setup-msbuild@v2 + uses: microsoft/setup-msbuild@v3 - name: Build specs with Python (Release) run: msbuild specs/specs.sln /p:Configuration=Release /p:Platform=x64 /p:EnablePython=true diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index 69c5008..63e7545 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -166,7 +166,7 @@ jobs: fetch-depth: 0 - name: Add MSBuild to PATH - uses: microsoft/setup-msbuild@v2 + uses: microsoft/setup-msbuild@v3 - name: Normalize Windows version metadata id: version @@ -252,7 +252,7 @@ jobs: python-version: '3.12' - name: Add MSBuild to PATH - uses: microsoft/setup-msbuild@v2 + uses: 
microsoft/setup-msbuild@v3 - name: Normalize Windows version metadata id: version From f69e81068971f5522b9f493c5bd3a1335442b81a Mon Sep 17 00:00:00 2001 From: niry1 Date: Thu, 4 Jun 2026 13:21:43 +0300 Subject: [PATCH 38/50] Issue #412 - update upload-artifact action to v7 --- .github/workflows/release.yml | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index 63e7545..b941da2 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -96,7 +96,7 @@ jobs: rpmbuild --define "_topdir $(pwd)/rpmbuild" -bb rpmbuild/SPECS/specs.spec - name: Upload RPM artifact - uses: actions/upload-artifact@v4 + uses: actions/upload-artifact@v7 with: name: linux-rpm path: rpmbuild/RPMS/**/*.rpm @@ -153,7 +153,7 @@ jobs: specs-${SPECS_VERSION#v}.pkg - name: Upload pkg artifact - uses: actions/upload-artifact@v4 + uses: actions/upload-artifact@v7 with: name: macos-pkg path: specs-*.pkg @@ -228,13 +228,13 @@ jobs: run: wix build -o specs-${{ steps.version.outputs.display }}.msi specs.wxs - name: Upload MSI artifact - uses: actions/upload-artifact@v4 + uses: actions/upload-artifact@v7 with: name: windows-msi path: specs-*.msi - name: Upload standalone executable artifact - uses: actions/upload-artifact@v4 + uses: actions/upload-artifact@v7 with: name: windows-exe path: specs-*-windows-x64.exe @@ -314,13 +314,13 @@ jobs: run: wix build -o specs-${{ steps.version.outputs.display }}-python312.msi specs-python.wxs - name: Upload MSI artifact - uses: actions/upload-artifact@v4 + uses: actions/upload-artifact@v7 with: name: windows-msi-python path: specs-*-python312.msi - name: Upload standalone executable artifact - uses: actions/upload-artifact@v4 + uses: actions/upload-artifact@v7 with: name: windows-exe-python path: specs-*-python312-windows-x64.exe @@ -410,7 +410,7 @@ jobs: dpkg-deb --build deb-root specs_${SPECS_VERSION#v}_${{ matrix.arch }}.deb - name: Upload DEB artifact - 
uses: actions/upload-artifact@v4 + uses: actions/upload-artifact@v7 with: name: linux-deb-${{ matrix.arch }} path: specs_*.deb From 7fccb6536f145e25483eb6c81229af4b04c35656 Mon Sep 17 00:00:00 2001 From: niry1 Date: Thu, 4 Jun 2026 13:33:23 +0300 Subject: [PATCH 39/50] Issue #412 - two more --- .github/workflows/release.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index b941da2..5111d49 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -422,7 +422,7 @@ jobs: contents: write steps: - name: Download all artifacts - uses: actions/download-artifact@v4 + uses: actions/download-artifact@v7 with: path: artifacts @@ -430,7 +430,7 @@ jobs: run: find artifacts -type f - name: Upload release assets - uses: softprops/action-gh-release@v2 + uses: softprops/action-gh-release@v3 with: files: | artifacts/linux-rpm/**/*.rpm From b74fcb1f1d887b281414e8116e0e9c989f53e203 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Thu, 4 Jun 2026 14:54:50 +0300 Subject: [PATCH 40/50] Issue #417 - segmentation fault in substitute (#420) --- specs/src/test/ALUUnitTest.cc | 1 + specs/src/utils/aluFunctions.cc | 3 +-- specs/tests/valgrind_unit_tests.py | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/specs/src/test/ALUUnitTest.cc b/specs/src/test/ALUUnitTest.cc index 53023e7..8ed7c94 100644 --- a/specs/src/test/ALUUnitTest.cc +++ b/specs/src/test/ALUUnitTest.cc @@ -1137,6 +1137,7 @@ int runALUUnitTests11(unsigned int onlyTest) VERIFY_EXPR_RES("substitute('Just the place for a snark',' ','','u')", "Just the place for a snark"); VERIFY_EXPR_RES("substitute('Just the place for a snark',' ','','U')", "Justtheplaceforasnark"); VERIFY_EXPR_RES("substitute('Just the place for a snark',' ','_','U')", "Just_the_place_for_a_snark"); + VERIFY_EXPR_RES("substitute('Just the place for a snark',' ','_')", "Just_the place for a snark"); VERIFY_EXPR_RES("sfield('Where hae\tya 
been',0,'')","sfield: Called with count equal to zero"); VERIFY_EXPR_RES("sfield('Where hae\tya been',1,'')","Where hae"); diff --git a/specs/src/utils/aluFunctions.cc b/specs/src/utils/aluFunctions.cc index 0eb302c..9fdabed 100644 --- a/specs/src/utils/aluFunctions.cc +++ b/specs/src/utils/aluFunctions.cc @@ -1689,8 +1689,7 @@ PValue AluFunc_substitute(PValue pSrc, PValue pSearchString, PValue pSubstitute, ASSERT_NOT_ELIDED(pSearchString,2,needle); ASSERT_NOT_ELIDED(pSubstitute,3,subst); std::string res = pSrc->getStr(); - ALUInt count = ARG_INT_WITH_DEFAULT(pMax,1); - if (pMax->getStr()=="U") count = MAX_ALUInt; + ALUInt count = (ARG_STR_WITH_DEFAULT(pMax,"") == "U") ? MAX_ALUInt : ARG_INT_WITH_DEFAULT(pMax,1); size_t findRet = 0; diff --git a/specs/tests/valgrind_unit_tests.py b/specs/tests/valgrind_unit_tests.py index 7d5bf86..2c780aa 100644 --- a/specs/tests/valgrind_unit_tests.py +++ b/specs/tests/valgrind_unit_tests.py @@ -1,6 +1,6 @@ import sys, memcheck, argparse -count_ALU_tests = 843 +count_ALU_tests = 844 count_processing_tests = 277 count_token_tests = 17 From df21378b3f04d6aa4cf9665c8d71ef96f9c55280 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Mon, 8 Jun 2026 16:31:38 +0300 Subject: [PATCH 41/50] Issue #424 - add build info string literals (#425) --- .github/workflows/c-cpp.yml | 9 ++-- .github/workflows/release.yml | 33 +++++++++++--- .gitignore | 3 ++ manpage | 35 ++++++++++++++- specs/docs/alu.md | 12 +++++ specs/docs/onepage.md | 6 +++ specs/src/ALUUnitTest.vcxproj | 1 + specs/src/ProcessingTest.vcxproj | 1 + specs/src/TokenTest.vcxproj | 1 + specs/src/build_info.targets | 47 ++++++++++++++++++++ specs/src/cacheTest.vcxproj | 1 + specs/src/cli/tokens.cc | 2 +- specs/src/generate_build_info.py | 76 ++++++++++++++++++++++++++++++++ specs/src/itemTest.vcxproj | 1 + specs/src/processing/Config.cc | 37 ++++++++++++++++ specs/src/processing/Config.h | 2 + specs/src/readWriteTest.vcxproj | 1 + specs/src/setup.py | 15 ++++++- specs/src/specs.vcxproj | 1 
+ specs/src/timeTest.vcxproj | 1 + 20 files changed, 271 insertions(+), 14 deletions(-) create mode 100644 specs/src/build_info.targets create mode 100644 specs/src/generate_build_info.py diff --git a/.github/workflows/c-cpp.yml b/.github/workflows/c-cpp.yml index 95a5cc0..005db0c 100644 --- a/.github/workflows/c-cpp.yml +++ b/.github/workflows/c-cpp.yml @@ -8,6 +8,7 @@ on: env: SPECS_BRANCH: ${{ github.event.pull_request.base.ref || github.ref_name }} + SPECS_BUILD_SOURCE: github jobs: build-linux: @@ -39,7 +40,7 @@ jobs: run: make all - name: Test specs executable - run: specs/exe/specs "@version" WRITE "@platform" + run: specs/exe/specs "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" - name: make check working-directory: specs/src @@ -67,7 +68,7 @@ jobs: run: make all - name: Test specs executable - run: specs/exe/specs "@version" WRITE "@platform" + run: specs/exe/specs "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" - name: make check working-directory: specs/src @@ -91,7 +92,7 @@ jobs: run: msbuild specs/specs.sln /p:Configuration=Release /p:Platform=x64 - name: Test specs executable - run: specs/bin/Release/specs.exe "@version" WRITE "@platform" + run: specs/bin/Release/specs.exe "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" build-windows-python: runs-on: windows-latest @@ -113,4 +114,4 @@ jobs: run: msbuild specs/specs.sln /p:Configuration=Release /p:Platform=x64 /p:EnablePython=true - name: Test specs executable - run: specs/bin/Release/specs.exe "@version" WRITE "@platform" + run: specs/bin/Release/specs.exe "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index a161edc..f8697d4 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -12,6 +12,8 @@ on: env: SPECS_VERSION: ${{ 
github.event.release.tag_name }} SPECS_BRANCH: ${{ github.event.release.target_commitish }} + SPECS_BUILD_SOURCE: github + SPECS_BUILD_NUMBER: ${{ github.run_number }} jobs: build-linux: @@ -19,10 +21,18 @@ jobs: container: image: ubuntu:22.04 steps: + - name: Install git (so checkout creates a real repository in the container) + run: | + DEBIAN_FRONTEND=noninteractive apt-get update + DEBIAN_FRONTEND=noninteractive apt-get install -y git + - uses: actions/checkout@v4 with: fetch-depth: 0 + - name: Mark workspace as safe for git + run: git config --global --add safe.directory "$GITHUB_WORKSPACE" + - name: Dump event release payload env: EVENT: ${{ toJSON(github.event.release) }} @@ -46,7 +56,7 @@ jobs: run: make some - name: Verify binary - run: specs/exe/specs "@version" WRITE "@platform" + run: specs/exe/specs "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" - name: Prepare manpage run: | @@ -121,7 +131,7 @@ jobs: run: make some - name: Verify binary - run: specs/exe/specs "@version" WRITE "@platform" + run: specs/exe/specs "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" - name: Prepare manpage run: | @@ -205,7 +215,7 @@ jobs: run: msbuild specs/specs.sln /p:Configuration=Release /p:Platform=x64 /p:GitTag=${{ steps.version.outputs.display }} - name: Verify binary - run: specs\bin\Release\specs.exe "@version" WRITE "@platform" + run: specs\bin\Release\specs.exe "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" - name: Prepare standalone executable shell: bash @@ -291,7 +301,7 @@ jobs: run: msbuild specs/specs.sln /p:Configuration=Release /p:Platform=x64 /p:GitTag=${{ steps.version.outputs.display }} /p:EnablePython=true - name: Verify binary - run: specs\bin\Release\specs.exe "@version" WRITE "@platform" + run: specs\bin\Release\specs.exe "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" - name: 
Prepare standalone executable shell: bash @@ -337,10 +347,23 @@ jobs: runs-on: ${{ matrix.runner }} container: ${{ matrix.container || '' }} steps: + - name: Install git (so checkout creates a real repository in the container) + run: | + if [ "$(id -u)" -eq 0 ]; then + APT="apt-get" + else + APT="sudo apt-get" + fi + DEBIAN_FRONTEND=noninteractive $APT update + DEBIAN_FRONTEND=noninteractive $APT install -y git + - uses: actions/checkout@v4 with: fetch-depth: 0 + - name: Mark workspace as safe for git + run: git config --global --add safe.directory "$GITHUB_WORKSPACE" + - name: Install build tools and Python 3.12 run: | if [ "$(id -u)" -eq 0 ]; then @@ -366,7 +389,7 @@ jobs: run: make some - name: Verify binary - run: specs/exe/specs "@version" WRITE "@platform" + run: specs/exe/specs "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" - name: Prepare manpage run: | diff --git a/.gitignore b/.gitignore index fa0913d..3ae0f58 100644 --- a/.gitignore +++ b/.gitignore @@ -48,3 +48,6 @@ specs/src/Release/ # GDB-related specs/src/.gdbinit specs/src/gdb/__pycache__ + +# Generated build info +specs/src/utils/build_info.h diff --git a/manpage b/manpage index 3d97907..3807d70 100644 --- a/manpage +++ b/manpage @@ -2272,11 +2272,11 @@ The system also defines some labels: .IP "version" 3 This returns the version of .B specs, -for example "v0.6" +for example "1.0.0" .P To find out the version of specs that you are using, use the following command: .P - specs @version 1 + specs @version .IP "cols" 3 contains the number of columns in the display. You can override this in the configuration file. For example, the following prints a right-justified string. .P @@ -2293,6 +2293,37 @@ depending on whether support for python function is available. contains a string describing this build. 
For example: .p POSIX (darwin) system using the g++ compiler and Python 3.9.6 - release variation +.IP "build-commit" 3 +contains the git commit hash (short form) of the build. For example: +.p + b74fcb1 +.IP "build-branch" 3 +contains the git branch name of the build. May be empty if not available. For example: +.p + dev-1.0.0 +.IP "build-time" 3 +contains the UTC timestamp of when the build was created. Format is +.B yyyy-MM-ddTHH:mm:ss. +For example: +.p + 2026-06-08T08:22:52 +.IP "build-source" 3 +contains either +.B local +or +.B github +depending on where the build was created. +.IP "build-number" 3 +contains the build number from GitHub Actions (github.run_number). Empty for local builds. +.IP "build-info" 3 +contains a composite string with all build information. For example: +.p + Built locally from commit 8bd11da on branch dev-1.0.0 at 2026-06-08T13:01:18 +.p +or +.p + Built on github (build 217) from commit 8bd11da of version 1.0.0 at 2026-06-08T12:40:01 +.p .SH EXAMPLES `ls -l` yields this: diff --git a/specs/docs/alu.md b/specs/docs/alu.md index 222a985..ca3c1c8 100644 --- a/specs/docs/alu.md +++ b/specs/docs/alu.md @@ -117,6 +117,18 @@ POSIX (darwin) system using the g++ compiler and Python 3.9.6 - release variatio ``` Others are `@cols`, which contains the number of columns in the terminal screen, and `@rows`, which contains the number of rows on that same screen. 
+Build information is also available via the following labels: +- `@build-commit` — the git commit hash (short form) of the build +- `@build-branch` — the git branch name (may be empty) +- `@build-time` — the UTC timestamp when the build was created (format: `yyyy-MM-ddTHH:mm:ss`) +- `@build-source` — either `local` or `github` +- `@build-number` — the GitHub Actions build number (empty for local builds) +- `@build-info` — a composite string with all build information, e.g.: + ``` + Built locally from commit 8bd11da on branch dev-1.0.0 at 2026-06-08T13:01:18 + Built on github (build 217) from commit 8bd11da of version 1.0.0 at 2026-06-08T12:40:01 + ``` + Additionally, the `@@` string stands for the entire input record. When rolling context is in effect (see [Streams and Records](streams.md#rolling-context)), `@@` always refers to the original input record. The `@!` string refers to the current record as affected by `CONTEXT`, which is the same as `@@` when no `CONTEXT` is active. The `@-n` and `@+n` syntax is an alternative to using that is effective within expressions. Note that reading beyond the input with `@+n` or `@-n` does not cause processing to stop, even if a `READSTOP` token is present in the specification. The following three specifications are equivalent: ``` diff --git a/specs/docs/onepage.md b/specs/docs/onepage.md index 0ee0ae1..3ac2289 100644 --- a/specs/docs/onepage.md +++ b/specs/docs/onepage.md @@ -200,6 +200,12 @@ There are some pre-configured labels that do not need to be explicitly defined: * platform - contains a string with the OS type, the compiler and the variation used to build *specs* * cols - contains the number of screen columns - useful for composed output placement. * rows - contains the number of screen rows. 
+* build-commit - contains the git commit hash (short form) of the build +* build-branch - contains the git branch name (may be empty) +* build-time - contains the UTC timestamp when the build was created (format: `yyyy-MM-ddTHH:mm:ss`) +* build-source - contains either `local` or `github` +* build-number - contains the GitHub Actions build number (empty for local builds) +* build-info - contains a composite string with all build information Examples ======== diff --git a/specs/src/ALUUnitTest.vcxproj b/specs/src/ALUUnitTest.vcxproj index a824806..80cfad0 100644 --- a/specs/src/ALUUnitTest.vcxproj +++ b/specs/src/ALUUnitTest.vcxproj @@ -31,6 +31,7 @@ true + diff --git a/specs/src/ProcessingTest.vcxproj b/specs/src/ProcessingTest.vcxproj index 3009b73..05a11b9 100644 --- a/specs/src/ProcessingTest.vcxproj +++ b/specs/src/ProcessingTest.vcxproj @@ -31,6 +31,7 @@ true + diff --git a/specs/src/TokenTest.vcxproj b/specs/src/TokenTest.vcxproj index 7fa3e9c..e60ce18 100644 --- a/specs/src/TokenTest.vcxproj +++ b/specs/src/TokenTest.vcxproj @@ -31,6 +31,7 @@ true + diff --git a/specs/src/build_info.targets b/specs/src/build_info.targets new file mode 100644 index 0000000..7e1e622 --- /dev/null +++ b/specs/src/build_info.targets @@ -0,0 +1,47 @@ + + + + + + + $([System.DateTime]::UtcNow.ToString("yyyy-MM-ddTHH:mm:ss")) + local + $(SPECS_BUILD_SOURCE) + $(SPECS_BUILD_NUMBER) + + + + + + + + + + + + + + + $(BuildCommit.Trim()) + $(BuildBranch.Trim()) + + + + + $(SPECS_BRANCH) + + + + + + + + + + + + + + + + diff --git a/specs/src/cacheTest.vcxproj b/specs/src/cacheTest.vcxproj index 4709e89..a57f66c 100644 --- a/specs/src/cacheTest.vcxproj +++ b/specs/src/cacheTest.vcxproj @@ -31,6 +31,7 @@ true + diff --git a/specs/src/cli/tokens.cc b/specs/src/cli/tokens.cc index c627e1a..405b951 100644 --- a/specs/src/cli/tokens.cc +++ b/specs/src/cli/tokens.cc @@ -493,7 +493,7 @@ void parseSingleToken(std::vector *pVec, std::string arg, int argidx) /* Check for a configuration literal */ 
std::string key = arg.substr(1); - if ((arg[0]=='@') && (arg.length() > 1) && (configSpecLiteralExists(key))) { + if ((arg[0]=='@') && (arg.length() > 1) && (configSpecLiteralDefined(key))) { std::string literal = configSpecLiteralGet(key); pVec->insert(pVec->end(), Token(TokenListType__LITERAL, nullptr /* range */, diff --git a/specs/src/generate_build_info.py b/specs/src/generate_build_info.py new file mode 100644 index 0000000..567d396 --- /dev/null +++ b/specs/src/generate_build_info.py @@ -0,0 +1,76 @@ +#!/usr/bin/env python3 +"""Generate build_info.h with current build information.""" + +import datetime +import os +import subprocess +import sys + + +def report_success(name, value): + print('Setting {} to "{}"'.format(name, value)) + + +def report_failure(name, exc): + sys.stderr.write("Failed to determine {}: {}\n".format(name, repr(exc))) + # If the failure came from a subprocess, surface the command's stderr too, + # since that usually explains *why* (e.g. git's "dubious ownership" error). 
+ output = getattr(exc, "output", None) + if output: + if isinstance(output, bytes): + output = output.decode(errors="replace") + sys.stderr.write(" stdout: {}\n".format(output.strip())) + stderr = getattr(exc, "stderr", None) + if stderr: + if isinstance(stderr, bytes): + stderr = stderr.decode(errors="replace") + sys.stderr.write(" stderr: {}\n".format(stderr.strip())) + + +def run_git(name, args): + """Run a git command, capturing stderr so failures can be reported.""" + try: + value = subprocess.check_output( + ['git'] + args, + stderr=subprocess.PIPE + ).decode().strip() + if value: + report_success(name, value) + return value + except Exception as exc: + report_failure(name, exc) + return "" + + +# Get commit hash +build_commit = run_git("SPECS_BUILD_COMMIT", ['rev-parse', '--short', 'HEAD']) + +# Get branch name +build_branch = run_git("SPECS_BUILD_BRANCH", ['branch', '--show-current']) + +# Fall back to the SPECS_BRANCH environment variable (set by release.yml) +# since `git branch --show-current` is empty on a detached HEAD / tag checkout. 
+if build_branch == "": + build_branch = os.environ.get("SPECS_BRANCH", "") + if build_branch: + report_success("SPECS_BUILD_BRANCH (from SPECS_BRANCH env)", build_branch) + +# Get UTC build time +build_time = datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%dT%H:%M:%S") +report_success("SPECS_BUILD_TIME", build_time) + +# Get build source and number from environment +build_source = os.environ.get("SPECS_BUILD_SOURCE", "local") +report_success("SPECS_BUILD_SOURCE", build_source) +build_number = os.environ.get("SPECS_BUILD_NUMBER", "") +report_success("SPECS_BUILD_NUMBER", build_number) + +# Write the header file +with open("utils/build_info.h", "w") as f: + f.write('#define SPECS_BUILD_COMMIT "{}"\n'.format(build_commit)) + f.write('#define SPECS_BUILD_BRANCH "{}"\n'.format(build_branch)) + f.write('#define SPECS_BUILD_TIME "{}"\n'.format(build_time)) + f.write('#define SPECS_BUILD_SOURCE "{}"\n'.format(build_source)) + f.write('#define SPECS_BUILD_NUMBER "{}"\n'.format(build_number)) + +print("Generated utils/build_info.h") diff --git a/specs/src/itemTest.vcxproj b/specs/src/itemTest.vcxproj index e6234ea..228b433 100644 --- a/specs/src/itemTest.vcxproj +++ b/specs/src/itemTest.vcxproj @@ -31,6 +31,7 @@ true + diff --git a/specs/src/processing/Config.cc b/specs/src/processing/Config.cc index 8a9e5a7..a08d9f4 100644 --- a/specs/src/processing/Config.cc +++ b/specs/src/processing/Config.cc @@ -19,6 +19,7 @@ #include "utils/PythonIntf.h" #include "utils/aluRegex.h" #include "utils/aluFunctions.h" +#include "utils/build_info.h" #include "Config.h" #define STRINGIFY2(x) #x @@ -212,6 +213,36 @@ void readConfigurationFile() if (0==ExternalLiterals.count("rows")) { ExternalLiterals["rows"] = getTerminalRowsAndColumns(true); } + + // Build information + ExternalLiterals["build-commit"] = dequote(STRINGIFY(SPECS_BUILD_COMMIT)); + ExternalLiterals["build-branch"] = dequote(STRINGIFY(SPECS_BUILD_BRANCH)); + ExternalLiterals["build-time"] = 
dequote(STRINGIFY(SPECS_BUILD_TIME)); + ExternalLiterals["build-source"] = dequote(STRINGIFY(SPECS_BUILD_SOURCE)); + ExternalLiterals["build-number"] = dequote(STRINGIFY(SPECS_BUILD_NUMBER)); + + // Compose build-info + std::string build_info = "Built "; + if (ExternalLiterals["build-source"] == "github") { + build_info += "on github"; + if (!ExternalLiterals["build-number"].empty()) { + build_info += " (build " + ExternalLiterals["build-number"] + ")"; + } + } else { + build_info += "locally"; + } + if (!ExternalLiterals["build-commit"].empty()) { + build_info += " from commit " + ExternalLiterals["build-commit"]; + if (!ExternalLiterals["build-branch"].empty()) { + if (ExternalLiterals["build-branch"]=="dev" || ExternalLiterals["build-branch"]=="stable") { + build_info += " of version " + ExternalLiterals["version"]; + } else { + build_info += " on branch " + ExternalLiterals["build-branch"]; + } + } + } + build_info += " at " + ExternalLiterals["build-time"]; + ExternalLiterals["build-info"] = build_info; } bool configSpecLiteralExists(std::string& key) @@ -220,6 +251,12 @@ bool configSpecLiteralExists(std::string& key) return it != ExternalLiterals.end() && !it->second.empty(); } +bool configSpecLiteralDefined(std::string& key) +{ + auto it = ExternalLiterals.find(key); + return it != ExternalLiterals.end(); +} + std::string& configSpecLiteralGet(std::string& key) { return ExternalLiterals[key]; diff --git a/specs/src/processing/Config.h b/specs/src/processing/Config.h index 34f7afb..85da620 100644 --- a/specs/src/processing/Config.h +++ b/specs/src/processing/Config.h @@ -65,6 +65,8 @@ void readConfigurationFile(); bool configSpecLiteralExists(std::string& key); +bool configSpecLiteralDefined(std::string& key); + std::string& configSpecLiteralGet(std::string& key); std::string& configSpecLiteralGetWithDefault(std::string& key, std::string& _default); diff --git a/specs/src/readWriteTest.vcxproj b/specs/src/readWriteTest.vcxproj index e13b871..13e3c4c 100644 
--- a/specs/src/readWriteTest.vcxproj +++ b/specs/src/readWriteTest.vcxproj @@ -31,6 +31,7 @@ true + diff --git a/specs/src/setup.py b/specs/src/setup.py index 3cac026..8323792 100644 --- a/specs/src/setup.py +++ b/specs/src/setup.py @@ -186,10 +186,12 @@ def python_search(arg): LIBOBJS = $(CCSRC:.cc=.{}) TESTOBJS = $(TESTSRC:.cc=.{}) +BUILD_INFO = utils/build_info.h + #default goal -some: directories $(EXE_DIR)/specs $(EXE_DIR)/specs-autocomplete +some: directories $(BUILD_INFO) $(EXE_DIR)/specs $(EXE_DIR)/specs-autocomplete -all: directories $(TEST_EXES) +all: directories $(BUILD_INFO) $(TEST_EXES) %.obj : %.cc $(CXX) $(CPPFLAGS) /Fo$@ /c $< @@ -212,6 +214,11 @@ def python_search(arg): body2 = \ """ +.PHONY: utils/build_info.h + +utils/build_info.h: + @python3 generate_build_info.py + run_tests: $(TEST_EXES) $(EXE_DIR)/TokenTest $(EXE_DIR)/ProcessingTest @@ -762,6 +769,10 @@ def python_search(arg): "python3 $(TESTS_DIR)/recfm_tests.py\n\tpython3 $(TESTS_DIR)/pytest.py" ) +# Generate build_info.h (so it exists before the first compile; it is +# regenerated on every build by the utils/build_info.h Makefile target) +subprocess.call([sys.executable, "generate_build_info.py"]) + with open("Makefile", "w") as makefile: makefile.write("CXX={}\n".format(cxx)) makefile.write("LINKER={}\n".format("link.exe" if (compiler=="VS") else cxx)) diff --git a/specs/src/specs.vcxproj b/specs/src/specs.vcxproj index 33a40d0..a814532 100644 --- a/specs/src/specs.vcxproj +++ b/specs/src/specs.vcxproj @@ -31,6 +31,7 @@ true + diff --git a/specs/src/timeTest.vcxproj b/specs/src/timeTest.vcxproj index 3c8f77f..c24bd88 100644 --- a/specs/src/timeTest.vcxproj +++ b/specs/src/timeTest.vcxproj @@ -31,6 +31,7 @@ true + From a35b3b1d15653fe16f70427b945e58397c2aa42f Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Tue, 9 Jun 2026 18:37:33 +0300 Subject: [PATCH 42/50] Issue #427 - Prevent build_info regen on make run_tests and install (#428) --- specs/src/setup.py | 16 +++++++++++----- 1 file 
changed, 11 insertions(+), 5 deletions(-) diff --git a/specs/src/setup.py b/specs/src/setup.py index 8323792..5f6977e 100644 --- a/specs/src/setup.py +++ b/specs/src/setup.py @@ -186,12 +186,18 @@ def python_search(arg): LIBOBJS = $(CCSRC:.cc=.{}) TESTOBJS = $(TESTSRC:.cc=.{}) -BUILD_INFO = utils/build_info.h +# build_info.h is regenerated only when one of the core objects would be +# rebuilt (Config.o is excluded to avoid a cycle, since Config.cc includes +# build_info.h). This keeps an up-to-date tree a no-op instead of forcing a +# spurious header regeneration, Config.o recompile and relink on every build. +BUILD_INFO_DEPS = $(filter-out processing/Config.o processing/Config.obj,$(LIBOBJS)) #default goal -some: directories $(BUILD_INFO) $(EXE_DIR)/specs $(EXE_DIR)/specs-autocomplete +some: directories $(EXE_DIR)/specs $(EXE_DIR)/specs-autocomplete -all: directories $(BUILD_INFO) $(TEST_EXES) +all: directories $(TEST_EXES) + +specs: directories $(EXE_DIR)/specs %.obj : %.cc $(CXX) $(CPPFLAGS) /Fo$@ /c $< @@ -214,9 +220,9 @@ def python_search(arg): body2 = \ """ -.PHONY: utils/build_info.h +.PHONY: specs -utils/build_info.h: +utils/build_info.h: $(BUILD_INFO_DEPS) @python3 generate_build_info.py run_tests: $(TEST_EXES) From 91738da050a427bc9aa0313413c20178f77aeecd Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Wed, 10 Jun 2026 22:14:12 +0300 Subject: [PATCH 43/50] Issue #429 - autocomplete defined labels (#430) --- manpage | 141 ++++++++++----------------- specs/docs/alu.md | 6 +- specs/docs/onepage.md | 2 +- specs/src/generate_build_info.py | 11 ++- specs/src/processing/Config.cc | 10 +- specs/src/processing/Config.h | 3 +- specs/src/test/specs-autocomplete.cc | 70 +++++++++++++ 7 files changed, 143 insertions(+), 100 deletions(-) diff --git a/manpage b/manpage index 3807d70..9dd9849 100644 --- a/manpage +++ b/manpage @@ -2302,9 +2302,9 @@ contains the git branch name of the build. May be empty if not available. 
For ex .p dev-1.0.0 .IP "build-time" 3 -contains the UTC timestamp of when the build was created. Format is +contains the timestamp of when the build was created. Format is .B yyyy-MM-ddTHH:mm:ss. -For example: +It has local time for local builds, and UTC for GitHub builds. For example: .p 2026-06-08T08:22:52 .IP "build-source" 3 @@ -2318,113 +2318,78 @@ contains the build number from GitHub Actions (github.run_number). Empty for loc .IP "build-info" 3 contains a composite string with all build information. For example: .p - Built locally from commit 8bd11da on branch dev-1.0.0 at 2026-06-08T13:01:18 + Built locally from commit 8bd11da on branch dev-1.0.0 at 2026-06-08T13:01:18 local .p or .p - Built on github (build 217) from commit 8bd11da of version 1.0.0 at 2026-06-08T12:40:01 + Built on github (build 217) from commit 8bd11da of version 1.0.0 at 2026-06-08T10:01:18 UTC .p .SH EXAMPLES `ls -l` yields this: - total 352 - -rw-r--r--@ 1 ynir admin 574 Aug 25 2009 Makefile - -rw-r--r--@ 1 ynir admin 3542 Nov 23 00:21 README - -rw-r--r--@ 1 ynir admin 362 Nov 19 08:31 conversion.h - -rw-r--r-- 1 ynir admin 984 Nov 11 17:45 ls.txt - -rw-r--r--@ 1 ynir admin 2233 Nov 23 00:03 main.cc - -rw-r--r-- 1 ynir admin 9412 Nov 23 00:11 main.o - -rw-r--r--@ 1 ynir admin 6567 Nov 23 00:09 spec_build.cc - -rw-r--r-- 1 ynir admin 16776 Nov 23 00:11 spec_build.o - -rw-r--r--@ 1 ynir admin 5494 Nov 19 08:30 spec_convert.cc - -rw-r--r-- 1 ynir admin 17004 Nov 23 00:11 spec_convert.o - -rw-r--r--@ 1 ynir admin 11419 Nov 23 00:10 spec_params.cc - -rw-r--r-- 1 ynir admin 21080 Nov 23 00:11 spec_params.o - -rw-r--r--@ 1 ynir admin 375 Nov 11 09:29 spec_vars.cc - -rw-r--r-- 1 ynir admin 4800 Nov 23 00:11 spec_vars.o - -rwxr-xr-x 1 ynir admin 36740 Nov 23 00:11 specs - -rw-r--r--@ 1 ynir admin 1547 Nov 23 00:10 specs.h + total 96 + -rw-rw-r-- 1 sio sio 16432 Jun 1 12:49 dataField.cc + -rw-rw-r-- 1 sio sio 6187 Jun 4 12:16 InputPart.cc + -rw-rw-r-- 1 sio sio 11717 Jun 4 12:16 item.h + 
-rw-rw-r-- 1 sio sio 35974 Jun 4 12:16 specItems.cc + -rw-rw-r-- 1 sio sio 1144 Jun 4 12:16 specItems.h + -rw-rw-r-- 1 sio sio 13953 Jun 1 12:49 splitItem.cc Let's run it through a spec: - ls -l | specs 12-* 1 redo w2 1 w4 d2x 8.8 r w8 17 + ls -l | specs 12-* 1 REDO IF "wordcount()>=8" THEN w2 1 w4 d2x 8.8 RIGHT w8 17 The first spec unit converts it to this: - 1 ynir admin 574 Aug 25 2009 Makefile - 1 ynir admin 3542 Nov 23 00:21 README - 1 ynir admin 362 Nov 19 08:31 conversion.h - 1 ynir admin 984 Nov 11 17:45 ls.txt - 1 ynir admin 2233 Nov 23 00:03 main.cc - 1 ynir admin 9412 Nov 23 00:11 main.o - 1 ynir admin 6567 Nov 23 00:09 spec_build.cc - 1 ynir admin 16776 Nov 23 00:11 spec_build.o - 1 ynir admin 5494 Nov 19 08:30 spec_convert.cc - 1 ynir admin 17004 Nov 23 00:11 spec_convert.o - 1 ynir admin 11419 Nov 23 00:10 spec_params.cc - 1 ynir admin 21080 Nov 23 00:11 spec_params.o - 1 ynir admin 375 Nov 11 09:29 spec_vars.cc - 1 ynir admin 4800 Nov 23 00:11 spec_vars.o - 1 ynir admin 36740 Nov 23 00:11 specs - 1 ynir admin 1547 Nov 23 00:10 specs.h - -Then after the redo, we get this: - - ynir 23e Makefile - ynir dd6 README - ynir 16a conversion.h - ynir 3d8 ls.txt - ynir 8b9 main.cc - ynir 24c4 main.o - ynir 19a7 spec_build.cc - ynir 4188 spec_build.o - ynir 1576 spec_convert.cc - ynir 426c spec_convert.o - ynir 2c9b spec_params.cc - ynir 5258 spec_params.o - ynir eae spec_vars.cc - ynir 12c0 spec_vars.o - ynir 8f84 specs - ynir 60b specs.h - + 1 sio sio 16432 Jun 1 12:49 dataField.cc + 1 sio sio 6187 Jun 4 12:16 InputPart.cc + 1 sio sio 11717 Jun 4 12:16 item.h + 1 sio sio 35974 Jun 4 12:16 specItems.cc + 1 sio sio 1144 Jun 4 12:16 specItems.h + 1 sio sio 13953 Jun 1 12:49 splitItem.cc + +Then after the REDO arm, we get this: + + sio 4030 dataField.cc + sio 182b InputPart.cc + sio 2dc5 item.h + sio 8c86 specItems.cc + sio 478 specItems.h + sio 3681 splitItem.cc Alternatively, let's arrange this on multiple lines: - ls -l | specs w9 1 write
"Size:" 3 w5 10-20 r - - Makefile - Owner: ynir - Size: 574 - README - Owner: ynir - Size: 5834 - conversion.h - Owner: ynir - Size: 362 - list.txt - Owner: ynir - Size: 978 - ls.txt - Owner: ynir - Size: 984 - main.cc - Owner: ynir - Size: 2233 - main.o - Owner: ynir - Size: 9412 + specs -C "ls -l" w9 1 write "Owner:" 3 w3 10 write "Size:" 3 w5 10-20 r + + Owner: + Size: r + dataField.cc + Owner: sio + Size: 16432 r + InputPart.cc + Owner: sio + Size: 6187 r + item.h + Owner: sio + Size: 11717 r + specItems.cc + Owner: sio + Size: 35974 r + specItems.h + Owner: sio + Size: 1144 r + splitItem.cc + Owner: sio + Size: 13953 r Finally, let's make our own version of the multi-column display: - ls -l | specs w9 1 read w9 26 read w9 51 - Makefile README - conversion.h main.cc main.o - spec_build.cc spec_build.o spec_convert.cc - spec_convert.o spec_params.cc spec_params.o - spec_vars.cc spec_vars.o specs - specs.h + specs -C "ls -l" w9 1 read w9 26 read w9 51 + dataField.cc InputPart.cc + item.h specItems.cc specItems.h + splitItem.cc .SH SEE ALSO sed(1), awk(1) diff --git a/specs/docs/alu.md b/specs/docs/alu.md index ca3c1c8..373a9fa 100644 --- a/specs/docs/alu.md +++ b/specs/docs/alu.md @@ -120,13 +120,13 @@ Others are `@cols`, which contains the number of columns in the terminal screen, Build information is also available via the following labels: - `@build-commit` — the git commit hash (short form) of the build - `@build-branch` — the git branch name (may be empty) -- `@build-time` — the UTC timestamp when the build was created (format: `yyyy-MM-ddTHH:mm:ss`) +- `@build-time` — the timestamp when the build was created (format: `yyyy-MM-ddTHH:mm:ss`). It's local time for local builds, or UTC for GitHub builds. 
- `@build-source` — either `local` or `github` - `@build-number` — the GitHub Actions build number (empty for local builds) - `@build-info` — a composite string with all build information, e.g.: ``` - Built locally from commit 8bd11da on branch dev-1.0.0 at 2026-06-08T13:01:18 - Built on github (build 217) from commit 8bd11da of version 1.0.0 at 2026-06-08T12:40:01 + Built locally from commit 8bd11da on branch dev-1.0.0 at 2026-06-08T13:01:18 local + Built on github (build 217) from commit 8bd11da of version 1.0.0 at 2026-06-08T10:01:18 UTC ``` Additionally, the `@@` string stands for the entire input record. When rolling context is in effect (see [Streams and Records](streams.md#rolling-context)), `@@` always refers to the original input record. The `@!` string refers to the current record as affected by `CONTEXT`, which is the same as `@@` when no `CONTEXT` is active. The `@-n` and `@+n` syntax is an alternative to using that is effective within expressions. Note that reading beyond the input with `@+n` or `@-n` does not cause processing to stop, even if a `READSTOP` token is present in the specification. The following three specifications are equivalent: diff --git a/specs/docs/onepage.md b/specs/docs/onepage.md index 3ac2289..1a4c903 100644 --- a/specs/docs/onepage.md +++ b/specs/docs/onepage.md @@ -202,7 +202,7 @@ There are some pre-configured labels that do not need to be explicitly defined: * rows - contains the number of screen rows. * build-commit - contains the git commit hash (short form) of the build * build-branch - contains the git branch name (may be empty) -* build-time - contains the UTC timestamp when the build was created (format: `yyyy-MM-ddTHH:mm:ss`) +* build-time - contains the timestamp when the build was created (format: `yyyy-MM-ddTHH:mm:ss`). It's local time for local builds, or UTC for GitHub builds. 
* build-source - contains either `local` or `github` * build-number - contains the GitHub Actions build number (empty for local builds) * build-info - contains a composite string with all build information diff --git a/specs/src/generate_build_info.py b/specs/src/generate_build_info.py index 567d396..32a2ade 100644 --- a/specs/src/generate_build_info.py +++ b/specs/src/generate_build_info.py @@ -55,16 +55,19 @@ def run_git(name, args): if build_branch: report_success("SPECS_BUILD_BRANCH (from SPECS_BRANCH env)", build_branch) -# Get UTC build time -build_time = datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%dT%H:%M:%S") -report_success("SPECS_BUILD_TIME", build_time) - # Get build source and number from environment build_source = os.environ.get("SPECS_BUILD_SOURCE", "local") report_success("SPECS_BUILD_SOURCE", build_source) build_number = os.environ.get("SPECS_BUILD_NUMBER", "") report_success("SPECS_BUILD_NUMBER", build_number) +# Get UTC build time +if build_source == "local": + build_time = datetime.datetime.now().strftime("%Y-%m-%dT%H:%M:%S") +else: + build_time = datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%dT%H:%M:%S") +report_success("SPECS_BUILD_TIME", build_time) + # Write the header file with open("utils/build_info.h", "w") as f: f.write('#define SPECS_BUILD_COMMIT "{}"\n'.format(build_commit)) diff --git a/specs/src/processing/Config.cc b/specs/src/processing/Config.cc index a08d9f4..ca62593 100644 --- a/specs/src/processing/Config.cc +++ b/specs/src/processing/Config.cc @@ -148,7 +148,7 @@ static std::string getTerminalRowsAndColumns(bool bGetRows) } -void readConfigurationFile() +void readConfigurationFile(useKeyValueCB cb) { std::string line; unsigned int lineCounter = 0; @@ -194,7 +194,11 @@ void readConfigurationFile() value = line.substr(idx2, idx-idx2); } - useKeyValue(key, value); + if (cb) { + (*cb)(key, value); + } else { + useKeyValue(key, value); + } } } else { } @@ -241,7 +245,7 @@ void 
readConfigurationFile() } } } - build_info += " at " + ExternalLiterals["build-time"]; + build_info += " at " + ExternalLiterals["build-time"] + (ExternalLiterals["build-source"] == "github" ? " UTC" : " local"); ExternalLiterals["build-info"] = build_info; } diff --git a/specs/src/processing/Config.h b/specs/src/processing/Config.h index 85da620..51eda5d 100644 --- a/specs/src/processing/Config.h +++ b/specs/src/processing/Config.h @@ -61,7 +61,8 @@ CONFIG_PARAMS #define EXTERNAL_FUNC_ERR_ZERO "zero" #define EXTERNAL_FUNC_ERR_NULLSTR "nullstr" -void readConfigurationFile(); +typedef void (*useKeyValueCB)(std::string& key, std::string& value); +void readConfigurationFile(useKeyValueCB cb = nullptr); bool configSpecLiteralExists(std::string& key); diff --git a/specs/src/test/specs-autocomplete.cc b/specs/src/test/specs-autocomplete.cc index ead92ea..3d695e3 100644 --- a/specs/src/test/specs-autocomplete.cc +++ b/specs/src/test/specs-autocomplete.cc @@ -1,5 +1,6 @@ #include #include +#include #include #include #include "processing/Config.h" @@ -7,6 +8,26 @@ typedef std::vector StringVector; +StringVector SystemDefinedLabels = { + "@version", + "@cols", + "@rows", + "@python", + "@platform", + "@build-commit", + "@build-branch", + "@build-time", + "@build-source", + "@build-number", + "@build-info" +}; + +static void AddToSystemDefineLabels(std::string& key, std::string& value) +{ + key.pop_back(); // remove the final colon + SystemDefinedLabels.push_back("@" + key); +} + void GetFilesByPrefix(StringVector& sv, const char* path, std::string& prefix) { std::filesystem::path specPath(path); @@ -35,8 +56,32 @@ static void getFilenameVector(StringVector& sv, std::string& incomplete, std::st } } +int CompleteUncertain_SystemLabels(std::string& incomplete) +{ + readConfigurationFile(AddToSystemDefineLabels); + StringVector vec; + for (auto s : SystemDefinedLabels) { + if (0==s.compare(0, incomplete.size(), incomplete)) { + vec.push_back(s); + } + } + + // sort the vector + 
std::sort(vec.begin(), vec.end()); + + for (auto s : vec) { + std::cout << s << "\n"; + } + + return 0; +} + int CompleteUncertain(std::string& incomplete, std::string& prevToken, std::string& line) { + if ('@'==incomplete[0]) { + return CompleteUncertain_SystemLabels(incomplete); + } + StringVector sv; getFilenameVector(sv, incomplete, prevToken); @@ -48,8 +93,33 @@ int CompleteUncertain(std::string& incomplete, std::string& prevToken, std::stri return 0; } +int CompleteIfUnambiguous_SystemLabels(std::string& incomplete) +{ + readConfigurationFile(AddToSystemDefineLabels); + std::string res; + for (auto s : SystemDefinedLabels) { + if (0==s.compare(0, incomplete.size(), incomplete)) { + if (res.empty()) { // First match + res = s; + } else { // not-first match + // trim non-matching characters + while (s.compare(0,res.size(), res)) { + res.pop_back(); + } + } + } + } + + std::cout << res; + return 0; +} + int CompleteIfUnambiguous(std::string& incomplete, std::string& prevToken, std::string& line) { + if ('@'==incomplete[0]) { + return CompleteIfUnambiguous_SystemLabels(incomplete); + } + StringVector sv; getFilenameVector(sv, incomplete, prevToken); From 48b7493746b9f0a2acf6f43066cadab6d687beb6 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Thu, 11 Jun 2026 10:52:12 +0300 Subject: [PATCH 44/50] Issue #432 - Enable specs auto-complete in Linux installers (#433) --- .github/packaging/specs-completion.bash | 2 ++ .github/packaging/specs.spec.in | 3 +++ .github/workflows/release.yml | 3 +++ .gitignore | 4 ++++ specs/src/setup.py | 3 ++- 5 files changed, 14 insertions(+), 1 deletion(-) create mode 100644 .github/packaging/specs-completion.bash diff --git a/.github/packaging/specs-completion.bash b/.github/packaging/specs-completion.bash new file mode 100644 index 0000000..b8a4a84 --- /dev/null +++ b/.github/packaging/specs-completion.bash @@ -0,0 +1,2 @@ +# bash completion for specs, provided by the specs-autocomplete helper. 
+complete -o bashdefault -o default -o nospace -C specs-autocomplete specs diff --git a/.github/packaging/specs.spec.in b/.github/packaging/specs.spec.in index 1d64242..4577ea1 100644 --- a/.github/packaging/specs.spec.in +++ b/.github/packaging/specs.spec.in @@ -21,9 +21,11 @@ multiple lines into single lines or vice versa. mkdir -p %{buildroot}/usr/local/bin mkdir -p %{buildroot}/usr/share/specs mkdir -p %{buildroot}/usr/lib/specs +mkdir -p %{buildroot}/etc/bash_completion.d install -m 755 specs %{buildroot}/usr/local/bin/specs install -m 755 specs-autocomplete %{buildroot}/usr/local/bin/specs-autocomplete install -m 644 specs.1.gz %{buildroot}/usr/share/specs/specs.1.gz +install -m 644 specs-completion.bash %{buildroot}/etc/bash_completion.d/specs cp -r python %{buildroot}/usr/lib/specs/ %post @@ -91,4 +93,5 @@ fi /usr/local/bin/specs /usr/local/bin/specs-autocomplete /usr/share/specs/specs.1.gz +/etc/bash_completion.d/specs /usr/lib/specs/python diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index f8697d4..58c2e0f 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -78,6 +78,7 @@ jobs: cp specs/exe/specs rpmbuild/SOURCES/specs-${SPECS_VERSION#v}/ cp specs/exe/specs-autocomplete rpmbuild/SOURCES/specs-${SPECS_VERSION#v}/ cp specs.1.gz rpmbuild/SOURCES/specs-${SPECS_VERSION#v}/ + cp .github/packaging/specs-completion.bash rpmbuild/SOURCES/specs-${SPECS_VERSION#v}/ - name: Bundle Python stdlib for RPM run: | @@ -405,12 +406,14 @@ jobs: mkdir -p deb-root/usr/local/bin mkdir -p deb-root/usr/share/specs mkdir -p deb-root/usr/lib/specs + mkdir -p deb-root/etc/bash_completion.d mkdir -p deb-root/DEBIAN cp specs/exe/specs deb-root/usr/local/bin/ cp specs/exe/specs-autocomplete deb-root/usr/local/bin/ chmod 755 deb-root/usr/local/bin/specs chmod 755 deb-root/usr/local/bin/specs-autocomplete cp specs.1.gz deb-root/usr/share/specs/ + install -m 644 .github/packaging/specs-completion.bash 
deb-root/etc/bash_completion.d/specs sed "s/@VERSION@/${SPECS_VERSION#v}/g; s/Architecture: amd64/Architecture: ${{ matrix.arch }}/g" .github/packaging/control.in > deb-root/DEBIAN/control cp .github/packaging/postinst deb-root/DEBIAN/ cp .github/packaging/postrm deb-root/DEBIAN/ diff --git a/.gitignore b/.gitignore index 3ae0f58..510502f 100644 --- a/.gitignore +++ b/.gitignore @@ -51,3 +51,7 @@ specs/src/gdb/__pycache__ # Generated build info specs/src/utils/build_info.h + +# AI agents +.devin + diff --git a/specs/src/setup.py b/specs/src/setup.py index 5f6977e..654c6b8 100644 --- a/specs/src/setup.py +++ b/specs/src/setup.py @@ -254,7 +254,8 @@ def python_search(arg): $(MKDIR_C) /usr/local/share/man/man1 cp specs.1.gz /usr/local/share/man/man1/ /bin/rm specs.1.gz - grep -v "complete -o bashdefault -o default -o nospace -C specs-autocomplete specs" BASHRC | /usr/local/bin/specs -o BASHRC 1-* 1 EOF "complete -o bashdefault -o default -o nospace -C specs-autocomplete specs" + $(MKDIR_C) /etc/bash_completion.d + cp ../../.github/packaging/specs-completion.bash /etc/bash_completion.d/specs install_win: $(EXE_DIR)/specs.exe echo "Please copy the file specs.exe in the EXE dir to a location on the PATH" From 71df2cf27864ef09003cb25a0144f797d4aed9f5 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Thu, 11 Jun 2026 15:03:13 +0300 Subject: [PATCH 45/50] Issue #424 - build-runid and build-url (#434) --- .github/workflows/c-cpp.yml | 4 +++- .github/workflows/release.yml | 12 +++++++----- manpage | 6 +++++- specs/docs/alu.md | 4 +++- specs/docs/onepage.md | 2 ++ specs/src/build_info.targets | 4 ++++ specs/src/generate_build_info.py | 6 ++++++ specs/src/processing/Config.cc | 7 +++++-- specs/src/test/specs-autocomplete.cc | 2 ++ 9 files changed, 37 insertions(+), 10 deletions(-) diff --git a/.github/workflows/c-cpp.yml b/.github/workflows/c-cpp.yml index 005db0c..489d906 100644 --- a/.github/workflows/c-cpp.yml +++ b/.github/workflows/c-cpp.yml @@ -9,6 +9,8 @@ on: env: 
SPECS_BRANCH: ${{ github.event.pull_request.base.ref || github.ref_name }} SPECS_BUILD_SOURCE: github + SPECS_BUILD_RUNID: ${{ github.run_id }} + SPECS_BUILD_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }} jobs: build-linux: @@ -40,7 +42,7 @@ jobs: run: make all - name: Test specs executable - run: specs/exe/specs "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" + run: specs/exe/specs "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" WRITE "Build URL:" 1 "@build-url" - name: make check working-directory: specs/src diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index 58c2e0f..08b2a64 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -14,6 +14,8 @@ env: SPECS_BRANCH: ${{ github.event.release.target_commitish }} SPECS_BUILD_SOURCE: github SPECS_BUILD_NUMBER: ${{ github.run_number }} + SPECS_BUILD_RUNID: ${{ github.run_id }} + SPECS_BUILD_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }} jobs: build-linux: @@ -56,7 +58,7 @@ jobs: run: make some - name: Verify binary - run: specs/exe/specs "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" + run: specs/exe/specs "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" WRITE "Build URL:" 1 "@build-url" - name: Prepare manpage run: | @@ -132,7 +134,7 @@ jobs: run: make some - name: Verify binary - run: specs/exe/specs "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" + run: specs/exe/specs "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" WRITE "Build URL:" 1 "@build-url" - name: Prepare manpage run: | @@ -216,7 +218,7 @@ jobs: run: msbuild specs/specs.sln /p:Configuration=Release /p:Platform=x64 /p:GitTag=${{ steps.version.outputs.display }} - 
name: Verify binary - run: specs\bin\Release\specs.exe "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" + run: specs\bin\Release\specs.exe "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" WRITE "Build URL:" 1 "@build-url" - name: Prepare standalone executable shell: bash @@ -302,7 +304,7 @@ jobs: run: msbuild specs/specs.sln /p:Configuration=Release /p:Platform=x64 /p:GitTag=${{ steps.version.outputs.display }} /p:EnablePython=true - name: Verify binary - run: specs\bin\Release\specs.exe "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" + run: specs\bin\Release\specs.exe "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" WRITE "Build URL:" 1 "@build-url" - name: Prepare standalone executable shell: bash @@ -390,7 +392,7 @@ jobs: run: make some - name: Verify binary - run: specs/exe/specs "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" + run: specs/exe/specs "Version:" 1 "@version" WRITE "Platform:" 1 "@platform" WRITE "Build info:" 1 "@build-info" WRITE "Build URL:" 1 "@build-url" - name: Prepare manpage run: | diff --git a/manpage b/manpage index 9dd9849..21c6694 100644 --- a/manpage +++ b/manpage @@ -2315,6 +2315,10 @@ or depending on where the build was created. .IP "build-number" 3 contains the build number from GitHub Actions (github.run_number). Empty for local builds. +.IP "build-runid" 3 +contains the 11-digit build id from GitHub Actions (github.run_id). Empty for local builds. +.IP "build-url" 3 +contains the build URL for the GitHub build. Empty for local builds. .IP "build-info" 3 contains a composite string with all build information. For example: .p @@ -2322,7 +2326,7 @@ contains a composite string with all build information. 
For example: .p or .p - Built on github (build 217) from commit 8bd11da of version 1.0.0 at 2026-06-08T10:01:18 UTC + Built on github (id 27334782912; build 217) from commit 8bd11da of version 1.0.0 at 2026-06-08T10:01:18 UTC .p .SH EXAMPLES diff --git a/specs/docs/alu.md b/specs/docs/alu.md index 373a9fa..2e1b7d5 100644 --- a/specs/docs/alu.md +++ b/specs/docs/alu.md @@ -123,10 +123,12 @@ Build information is also available via the following labels: - `@build-time` — the timestamp when the build was created (format: `yyyy-MM-ddTHH:mm:ss`). It's local time for local builds, or UTC for GitHub builds. - `@build-source` — either `local` or `github` - `@build-number` — the GitHub Actions build number (empty for local builds) +- `@build-runid` — the 11-digit GitHub Actions run id (empty for local builds) +- `@build-url` — the build URL for the GitHub build (empty for local builds), e.g.: [https://github.com/yoavnir/specs2016/actions/runs/27334782912](https://github.com/yoavnir/specs2016/actions/runs/27334782912). - `@build-info` — a composite string with all build information, e.g.: ``` Built locally from commit 8bd11da on branch dev-1.0.0 at 2026-06-08T13:01:18 local - Built on github (build 217) from commit 8bd11da of version 1.0.0 at 2026-06-08T10:01:18 UTC + Built on github (id 27334782912; build 217) from commit 8bd11da of version 1.0.0 at 2026-06-08T10:01:18 UTC ``` Additionally, the `@@` string stands for the entire input record. When rolling context is in effect (see [Streams and Records](streams.md#rolling-context)), `@@` always refers to the original input record. The `@!` string refers to the current record as affected by `CONTEXT`, which is the same as `@@` when no `CONTEXT` is active. The `@-n` and `@+n` syntax is an alternative to using that is effective within expressions. Note that reading beyond the input with `@+n` or `@-n` does not cause processing to stop, even if a `READSTOP` token is present in the specification.
The following three specifications are equivalent: diff --git a/specs/docs/onepage.md b/specs/docs/onepage.md index 1a4c903..22387ac 100644 --- a/specs/docs/onepage.md +++ b/specs/docs/onepage.md @@ -205,6 +205,8 @@ There are some pre-configured labels that do not need to be explicitly defined: * build-time - contains the timestamp when the build was created (format: `yyyy-MM-ddTHH:mm:ss`). It's local time for local builds, or UTC for GitHub builds. * build-source - contains either `local` or `github` * build-number - contains the GitHub Actions build number (empty for local builds) +* build-runid - contains the 11-digit GitHub Actions run id (empty for local builds) +* build-url - contains the build URL for the GitHub build (empty for local builds) * build-info - contains a composite string with all build information Examples diff --git a/specs/src/build_info.targets b/specs/src/build_info.targets index 7e1e622..c002637 100644 --- a/specs/src/build_info.targets +++ b/specs/src/build_info.targets @@ -8,6 +8,8 @@ local $(SPECS_BUILD_SOURCE) $(SPECS_BUILD_NUMBER) + $(SPECS_BUILD_RUNID) + $(SPECS_BUILD_URL) @@ -38,6 +40,8 @@ + + diff --git a/specs/src/generate_build_info.py b/specs/src/generate_build_info.py index 32a2ade..4d574bb 100644 --- a/specs/src/generate_build_info.py +++ b/specs/src/generate_build_info.py @@ -60,6 +60,10 @@ def run_git(name, args): report_success("SPECS_BUILD_SOURCE", build_source) build_number = os.environ.get("SPECS_BUILD_NUMBER", "") report_success("SPECS_BUILD_NUMBER", build_number) +build_runid = os.environ.get("SPECS_BUILD_RUNID", "") +report_success("SPECS_BUILD_RUNID", build_runid) +build_url = os.environ.get("SPECS_BUILD_URL", "") +report_success("SPECS_BUILD_URL", build_url) # Get UTC build time if build_source == "local": @@ -75,5 +79,7 @@ def run_git(name, args): f.write('#define SPECS_BUILD_TIME "{}"\n'.format(build_time)) f.write('#define SPECS_BUILD_SOURCE "{}"\n'.format(build_source)) f.write('#define SPECS_BUILD_NUMBER
"{}"\n'.format(build_number)) + f.write('#define SPECS_BUILD_RUNID "{}"\n'.format(build_runid)) + f.write('#define SPECS_BUILD_URL "{}"\n'.format(build_url)) print("Generated utils/build_info.h") diff --git a/specs/src/processing/Config.cc b/specs/src/processing/Config.cc index ca62593..741e71a 100644 --- a/specs/src/processing/Config.cc +++ b/specs/src/processing/Config.cc @@ -224,14 +224,17 @@ void readConfigurationFile(useKeyValueCB cb) ExternalLiterals["build-time"] = dequote(STRINGIFY(SPECS_BUILD_TIME)); ExternalLiterals["build-source"] = dequote(STRINGIFY(SPECS_BUILD_SOURCE)); ExternalLiterals["build-number"] = dequote(STRINGIFY(SPECS_BUILD_NUMBER)); + ExternalLiterals["build-runid"] = dequote(STRINGIFY(SPECS_BUILD_RUNID)); + ExternalLiterals["build-url"] = dequote(STRINGIFY(SPECS_BUILD_URL)); // Compose build-info std::string build_info = "Built "; if (ExternalLiterals["build-source"] == "github") { - build_info += "on github"; + build_info += "on github (id " + ExternalLiterals["build-runid"]; if (!ExternalLiterals["build-number"].empty()) { - build_info += " (build " + ExternalLiterals["build-number"] + ")"; + build_info += "; build " + ExternalLiterals["build-number"]; } + build_info += ")"; } else { build_info += "locally"; } diff --git a/specs/src/test/specs-autocomplete.cc b/specs/src/test/specs-autocomplete.cc index 3d695e3..b9a9b11 100644 --- a/specs/src/test/specs-autocomplete.cc +++ b/specs/src/test/specs-autocomplete.cc @@ -19,6 +19,8 @@ StringVector SystemDefinedLabels = { "@build-time", "@build-source", "@build-number", + "@build-runid", + "@build-url", "@build-info" }; From 8da496b288f156d13549102907d6907ea00f90ca Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Thu, 11 Jun 2026 17:44:29 +0300 Subject: [PATCH 46/50] Issue #431 - Add Mac OS auto-completion (#435) --- .github/packaging/specs-completion.zsh | 17 +++++++++++++++++ specs/src/setup.py | 2 ++ specs/src/test/specs-autocomplete.cc | 2 +- 3 files changed, 20 insertions(+), 1 deletion(-) 
create mode 100644 .github/packaging/specs-completion.zsh diff --git a/.github/packaging/specs-completion.zsh b/.github/packaging/specs-completion.zsh new file mode 100644 index 0000000..1d49994 --- /dev/null +++ b/.github/packaging/specs-completion.zsh @@ -0,0 +1,17 @@ +#compdef specs + +_specs() { + local cur prev + local -a completions + + cur="${words[CURRENT]}" + prev="${words[CURRENT-1]}" + + # Call specs-autocomplete and capture newline-separated results + completions=("${(@f)$(specs-autocomplete specs "$cur" "$prev")}") + + compadd -- $completions +} + +_specs "$@" + diff --git a/specs/src/setup.py b/specs/src/setup.py index 654c6b8..77c9cff 100644 --- a/specs/src/setup.py +++ b/specs/src/setup.py @@ -242,10 +242,12 @@ def python_search(arg): install_mac: $(EXE_DIR)/specs specs.1.gz cp $(EXE_DIR)/specs /usr/local/bin/ + cp $(EXE_DIR)/specs-autocomplete /usr/local/bin/ /bin/rm */*.d $(MKDIR_C) /usr/local/share/man/man1 cp specs.1.gz /usr/local/share/man/man1/ /bin/rm specs.1.gz + cp ../../.github/packaging/specs-completion.zsh /usr/local/share/zsh/site-functions/_specs install_linux: $(EXE_DIR)/specs specs.1.gz cp $(EXE_DIR)/specs /usr/local/bin/ diff --git a/specs/src/test/specs-autocomplete.cc b/specs/src/test/specs-autocomplete.cc index b9a9b11..557842d 100644 --- a/specs/src/test/specs-autocomplete.cc +++ b/specs/src/test/specs-autocomplete.cc @@ -172,7 +172,7 @@ int main(int argc, char** argv) try { cursorPos = std::stoul(safe); } catch (const std::exception&) {} } safe = getenv("COMP_TYPE"); - char type = '\t'; + char type = '?'; if (safe) { try { type = char(std::stoi(safe)); } catch (const std::exception&) {} } From 91cb9b5ada11ddd14cc541d923acfcb9015ee8d0 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Sun, 14 Jun 2026 17:41:54 +0300 Subject: [PATCH 47/50] Issue #431 - Add autocomplete in installation (#436) --- .github/packaging/postinstall_macos | 30 +++++++++++++++++++++++++++++ .github/workflows/release.yml | 9 +++++++++ specs/src/setup.py | 2 ++ 
3 files changed, 41 insertions(+) create mode 100755 .github/packaging/postinstall_macos diff --git a/.github/packaging/postinstall_macos b/.github/packaging/postinstall_macos new file mode 100755 index 0000000..31f45d2 --- /dev/null +++ b/.github/packaging/postinstall_macos @@ -0,0 +1,30 @@ +#!/bin/bash +# Enable zsh completion for specs by making sure /etc/zshrc loads the +# completion functions installed under /usr/local/share/zsh/site-functions. +# +# This script is used both by the macOS .pkg installer (as its postinstall +# script) and by the local "make install" target, so the behaviour stays +# identical between the two installation paths. +# +# It is idempotent: the whole block is wrapped in sentinel markers and is only +# appended when those markers are not already present, so running it any number +# of times never produces duplicate fpath entries or duplicate compinit calls. +set -e + +ZSHRC="/etc/zshrc" +MARKER="# >>> specs completion >>>" + +if ! grep -qF "$MARKER" "$ZSHRC" 2>/dev/null; then + cat >> "$ZSHRC" <<'EOF' + +# >>> specs completion >>> +typeset -U fpath +fpath=(/usr/local/share/zsh/site-functions $fpath) + +autoload -Uz compinit +compinit +# <<< specs completion <<< +EOF +fi + +exit 0 diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index 08b2a64..e87c2e8 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -149,16 +149,25 @@ jobs: run: | mkdir -p pkg-root/usr/local/bin mkdir -p pkg-root/usr/local/share/man/man1 + mkdir -p pkg-root/usr/local/share/zsh/site-functions cp specs/exe/specs pkg-root/usr/local/bin/ cp specs/exe/specs-autocomplete pkg-root/usr/local/bin/ chmod 755 pkg-root/usr/local/bin/specs chmod 755 pkg-root/usr/local/bin/specs-autocomplete cp specs.1.gz pkg-root/usr/local/share/man/man1/ + cp .github/packaging/specs-completion.zsh pkg-root/usr/local/share/zsh/site-functions/_specs + + - name: Prepare pkg scripts + run: | + mkdir -p pkg-scripts + cp 
.github/packaging/postinstall_macos pkg-scripts/postinstall + chmod 755 pkg-scripts/postinstall - name: Build .pkg run: | pkgbuild \ --root pkg-root \ + --scripts pkg-scripts \ --identifier com.github.yoavnir.specs \ --version "${SPECS_VERSION#v}" \ --install-location / \ diff --git a/specs/src/setup.py b/specs/src/setup.py index 77c9cff..7bab222 100644 --- a/specs/src/setup.py +++ b/specs/src/setup.py @@ -247,7 +247,9 @@ def python_search(arg): $(MKDIR_C) /usr/local/share/man/man1 cp specs.1.gz /usr/local/share/man/man1/ /bin/rm specs.1.gz + $(MKDIR_C) /usr/local/share/zsh/site-functions cp ../../.github/packaging/specs-completion.zsh /usr/local/share/zsh/site-functions/_specs + /bin/bash ../../.github/packaging/postinstall_macos install_linux: $(EXE_DIR)/specs specs.1.gz cp $(EXE_DIR)/specs /usr/local/bin/ From 843db6551ecebcb697ecab1310db31f5cfd31ba6 Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Sun, 21 Jun 2026 17:19:18 +0300 Subject: [PATCH 48/50] Issue #437 - Add bundled Python 3.12 (#441) --- .github/packaging/specs-python.wxs.in | 26 ++-- .github/packaging/specs.spec.in | 7 +- .github/workflows/release.yml | 172 +++++++++++++++++++++++--- PYTHON_LICENSE | 42 +++++++ README.md | 2 +- specs/src/setup.py | 29 +++++ specs/src/utils/PythonIntf.cc | 50 ++++++-- 7 files changed, 290 insertions(+), 38 deletions(-) create mode 100644 PYTHON_LICENSE diff --git a/.github/packaging/specs-python.wxs.in b/.github/packaging/specs-python.wxs.in index 1cbf1c1..8d3fabc 100644 --- a/.github/packaging/specs-python.wxs.in +++ b/.github/packaging/specs-python.wxs.in @@ -13,17 +13,27 @@ - - - - - - + + + + + + + + + + + - + + diff --git a/.github/packaging/specs.spec.in b/.github/packaging/specs.spec.in index 4577ea1..128ec27 100644 --- a/.github/packaging/specs.spec.in +++ b/.github/packaging/specs.spec.in @@ -6,7 +6,7 @@ License: MIT URL: https://github.com/yoavnir/specs2016 Source0: specs-%{version}.tar.gz -# Disable automatic dependency detection since Python is statically 
linked +# Disable automatic dependency detection since Python is bundled AutoReqProv: no %description @@ -20,12 +20,15 @@ multiple lines into single lines or vice versa. %install mkdir -p %{buildroot}/usr/local/bin mkdir -p %{buildroot}/usr/share/specs +mkdir -p %{buildroot}/usr/share/doc/specs mkdir -p %{buildroot}/usr/lib/specs mkdir -p %{buildroot}/etc/bash_completion.d install -m 755 specs %{buildroot}/usr/local/bin/specs install -m 755 specs-autocomplete %{buildroot}/usr/local/bin/specs-autocomplete install -m 644 specs.1.gz %{buildroot}/usr/share/specs/specs.1.gz install -m 644 specs-completion.bash %{buildroot}/etc/bash_completion.d/specs +install -m 644 docs/LICENSE %{buildroot}/usr/share/doc/specs/LICENSE +install -m 644 docs/PYTHON_LICENSE %{buildroot}/usr/share/doc/specs/PYTHON_LICENSE cp -r python %{buildroot}/usr/lib/specs/ %post @@ -94,4 +97,6 @@ fi /usr/local/bin/specs-autocomplete /usr/share/specs/specs.1.gz /etc/bash_completion.d/specs +/usr/share/doc/specs/LICENSE +/usr/share/doc/specs/PYTHON_LICENSE /usr/lib/specs/python diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index c22e70a..df64744 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -83,18 +83,34 @@ jobs: cp specs.1.gz rpmbuild/SOURCES/specs-${SPECS_VERSION#v}/ cp .github/packaging/specs-completion.bash rpmbuild/SOURCES/specs-${SPECS_VERSION#v}/ - - name: Bundle Python stdlib for RPM + - name: Bundle Python stdlib and shared library for RPM run: | - PYVER=$(python3 -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')") + PYVER=$(python3.12 -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')") + # Use the platlibdir reported by the bundled Python (e.g. "lib" on + # Debian/Ubuntu, "lib64" on Fedora/RHEL-based) so the bundled stdlib + # lands in the directory that libpython will actually search. 
+ PLATLIBDIR=$(python3.12 -c "import sys; print(sys.platlibdir)") SRCDIR=rpmbuild/SOURCES/specs-${SPECS_VERSION#v} - mkdir -p ${SRCDIR}/python/lib/python${PYVER} - cp -r /usr/lib/python${PYVER}/* ${SRCDIR}/python/lib/python${PYVER}/ + mkdir -p ${SRCDIR}/python/${PLATLIBDIR}/python${PYVER} + cp -r /usr/${PLATLIBDIR}/python${PYVER}/* ${SRCDIR}/python/${PLATLIBDIR}/python${PYVER}/ 2>/dev/null || \ + cp -r /usr/lib/python${PYVER}/* ${SRCDIR}/python/${PLATLIBDIR}/python${PYVER}/ + # Copy the shared libpython library into the same platlibdir subtree + cp -L /usr/${PLATLIBDIR}/libpython${PYVER}.so.1.0 ${SRCDIR}/python/${PLATLIBDIR}/ 2>/dev/null || \ + cp -L /usr/lib/x86_64-linux-gnu/libpython${PYVER}.so.1.0 ${SRCDIR}/python/${PLATLIBDIR}/ 2>/dev/null || \ + cp -L /usr/lib/aarch64-linux-gnu/libpython${PYVER}.so.1.0 ${SRCDIR}/python/${PLATLIBDIR}/ 2>/dev/null || true # Remove unnecessary files to reduce package size - find ${SRCDIR}/python/lib/python${PYVER} -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null || true - find ${SRCDIR}/python/lib/python${PYVER} -type f -name "*.pyc" -delete - find ${SRCDIR}/python/lib/python${PYVER} -type f -name "*.pyo" -delete - rm -rf ${SRCDIR}/python/lib/python${PYVER}/site-packages - rm -rf ${SRCDIR}/python/lib/python${PYVER}/dist-packages + find ${SRCDIR}/python/${PLATLIBDIR}/python${PYVER} -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null || true + find ${SRCDIR}/python/${PLATLIBDIR}/python${PYVER} -type f -name "*.pyc" -delete + find ${SRCDIR}/python/${PLATLIBDIR}/python${PYVER} -type f -name "*.pyo" -delete + rm -rf ${SRCDIR}/python/${PLATLIBDIR}/python${PYVER}/site-packages + rm -rf ${SRCDIR}/python/${PLATLIBDIR}/python${PYVER}/dist-packages + + - name: Copy license files for RPM + run: | + SRCDIR=rpmbuild/SOURCES/specs-${SPECS_VERSION#v} + mkdir -p ${SRCDIR}/docs + cp LICENSE ${SRCDIR}/docs/ + cp PYTHON_LICENSE ${SRCDIR}/docs/ - name: Create RPM tarball run: | @@ -151,12 +167,51 @@ jobs: mkdir -p pkg-root/usr/local/bin 
mkdir -p pkg-root/usr/local/share/man/man1 mkdir -p pkg-root/usr/local/share/zsh/site-functions + mkdir -p pkg-root/usr/local/share/doc/specs cp specs/exe/specs pkg-root/usr/local/bin/ cp specs/exe/specs-autocomplete pkg-root/usr/local/bin/ chmod 755 pkg-root/usr/local/bin/specs chmod 755 pkg-root/usr/local/bin/specs-autocomplete cp specs.1.gz pkg-root/usr/local/share/man/man1/ cp .github/packaging/specs-completion.zsh pkg-root/usr/local/share/zsh/site-functions/_specs + cp LICENSE pkg-root/usr/local/share/doc/specs/ + cp PYTHON_LICENSE pkg-root/usr/local/share/doc/specs/ + + - name: Bundle Python dylib and stdlib for pkg + run: | + PYVER=$(python3.12 -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')") + # Resolve the bundled install prefix used by setup.py (-> PYTHON_STDLIB_PATH) + BUNDLE_PREFIX=/usr/local/lib/specs/python + LIBDIR=$(python3.12 -c "import sysconfig; print(sysconfig.get_config_var('LIBDIR'))") + STDLIB=$(python3.12 -c "import sysconfig; print(sysconfig.get_path('stdlib'))") + mkdir -p pkg-root${BUNDLE_PREFIX}/lib/python${PYVER} + # Locate and copy the shared libpython dylib (cp -L resolves any symlink + # into the framework's real Mach-O so the bundled copy is standalone) + SRC_DYLIB="${LIBDIR}/libpython${PYVER}.dylib" + if [ ! 
-e "${SRC_DYLIB}" ]; then + SRC_DYLIB=$(find "$(python3.12 -c "import sys; print(sys.base_prefix)")" -name "libpython${PYVER}.dylib" | head -n1) + fi + cp -L "${SRC_DYLIB}" pkg-root${BUNDLE_PREFIX}/lib/libpython${PYVER}.dylib + # Copy the standard library (includes lib-dynload extension modules) + cp -R "${STDLIB}/" pkg-root${BUNDLE_PREFIX}/lib/python${PYVER}/ + # Trim files that are not needed at runtime to reduce package size + find pkg-root${BUNDLE_PREFIX}/lib/python${PYVER} -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null || true + find pkg-root${BUNDLE_PREFIX}/lib/python${PYVER} -type f -name "*.pyc" -delete + rm -rf pkg-root${BUNDLE_PREFIX}/lib/python${PYVER}/site-packages + rm -rf pkg-root${BUNDLE_PREFIX}/lib/python${PYVER}/test + # Repoint the binary at the bundled dylib so it no longer needs system Python + OLDREF=$(otool -L pkg-root/usr/local/bin/specs | awk 'NR>1 && (/libpython'${PYVER}'/ || /Python\.framework/){print $1; exit}') + NEWREF=${BUNDLE_PREFIX}/lib/libpython${PYVER}.dylib + echo "Repointing libpython reference: ${OLDREF} -> ${NEWREF}" + install_name_tool -change "${OLDREF}" "${NEWREF}" pkg-root/usr/local/bin/specs + install_name_tool -id "${NEWREF}" pkg-root${BUNDLE_PREFIX}/lib/libpython${PYVER}.dylib + chmod 755 pkg-root${BUNDLE_PREFIX}/lib/libpython${PYVER}.dylib + # install_name_tool invalidates the (ad-hoc) code signature; re-sign so the + # dynamic loader will accept the binaries on Apple Silicon. 
+ codesign --force --sign - pkg-root${BUNDLE_PREFIX}/lib/libpython${PYVER}.dylib + codesign --force --sign - pkg-root/usr/local/bin/specs + # Sanity check that the reference now points at the bundled copy + otool -L pkg-root/usr/local/bin/specs | grep -i "libpython${PYVER}" - name: Prepare pkg scripts run: | @@ -320,9 +375,76 @@ jobs: shell: bash run: cp specs/bin/Release/specs.exe specs-${{ steps.version.outputs.display }}-python312-windows-x64.exe + - name: Stage MSI payload (bundle Python runtime) + shell: bash + run: | + # Use forward slashes so the paths are safe to use inside bash + PREFIX=$(python -c "import sys; print(sys.base_prefix.replace(chr(92), '/'))") + PYVER=$(python -c "import sys; print(f'{sys.version_info.major}{sys.version_info.minor}')") + mkdir -p msi-stage + cp specs/bin/Release/specs.exe msi-stage/ + # The Python DLLs must sit next to specs.exe so Windows loads them from + # the application directory without any system Python installation. + cp "${PREFIX}/python${PYVER}.dll" msi-stage/ + cp "${PREFIX}/python3.dll" msi-stage/ 2>/dev/null || true + cp "${PREFIX}/vcruntime140.dll" msi-stage/ 2>/dev/null || true + cp "${PREFIX}/vcruntime140_1.dll" msi-stage/ 2>/dev/null || true + # The C extension modules (.pyd) and the pure-Python standard library. 
+ cp -r "${PREFIX}/DLLs" msi-stage/DLLs + cp -r "${PREFIX}/Lib" msi-stage/Lib + cp LICENSE msi-stage/LICENSE.txt + cp PYTHON_LICENSE msi-stage/PYTHON_LICENSE.txt + # Trim files that are not needed at runtime to reduce package size + find msi-stage/Lib -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null || true + rm -rf msi-stage/Lib/site-packages + rm -rf msi-stage/Lib/test + rm -rf msi-stage/Lib/idlelib + rm -rf msi-stage/Lib/tkinter + rm -rf msi-stage/Lib/turtledemo + - name: Install WiX Toolset run: dotnet tool install --global wix --version 4.0.6 + - name: Harvest Python runtime files (generate WiX fragment) + shell: pwsh + run: | + # Enumerate every file under msi-stage\ except specs.exe and emit a + # WiX v4 fragment with one Component/File per file, each with an + # explicit unique Id derived from the full relative path so that files + # with the same name in different directories (e.g. __init__.py) do not + # collide. heat.exe is a separate NuGet package in WiX v4 and not + # available via the dotnet global tool, so we generate the fragment here. + $root = Resolve-Path "msi-stage" + $rootLen = $root.Path.Length + $files = Get-ChildItem -Path $root -Recurse -File | + Where-Object { $_.Name -ne "specs.exe" } + $lines = [System.Collections.Generic.List[string]]::new() + $lines.Add('') + $lines.Add('') + $lines.Add(' ') + $lines.Add(' ') + foreach ($f in $files) { + # Path relative to the workspace root, e.g. msi-stage\Lib\os.py + $rel = $f.FullName.Substring((Get-Location).Path.Length + 1) + # Path relative to msi-stage\, e.g. 
Lib\os.py (used as Subdirectory) + $sub = $f.FullName.Substring($rootLen + 1) + # Unique XML identifier: replace every non-alphanumeric char with '_' + $id = "f_" + ($sub -replace '[^A-Za-z0-9]', '_') + # Subdirectory of the file relative to msi-stage (empty for top-level) + $dir = Split-Path $sub -Parent + if ($dir) { + $lines.Add(" ") + } else { + $lines.Add(" ") + } + $lines.Add(" ") + $lines.Add(" ") + } + $lines.Add(' ') + $lines.Add(' ') + $lines.Add('') + $lines | Set-Content -Encoding UTF8 msi-stage.wxs + - name: Create WiX source shell: bash run: | @@ -333,7 +455,7 @@ jobs: - name: Build MSI shell: bash - run: wix build -o specs-${{ steps.version.outputs.display }}-python312.msi specs-python.wxs + run: wix build -o specs-${{ steps.version.outputs.display }}-python312.msi specs-python.wxs msi-stage.wxs - name: Upload MSI artifact uses: actions/upload-artifact@v7 @@ -430,17 +552,29 @@ jobs: cp .github/packaging/postinst deb-root/DEBIAN/ cp .github/packaging/postrm deb-root/DEBIAN/ - - name: Bundle Python stdlib for DEB + - name: Bundle Python stdlib and shared library for DEB run: | - PYVER=$(python3 -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')") - mkdir -p deb-root/usr/lib/specs/python/lib/python${PYVER} - cp -r /usr/lib/python${PYVER}/* deb-root/usr/lib/specs/python/lib/python${PYVER}/ + PYVER=$(python3.12 -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')") + # Use the platlibdir reported by the bundled Python so the stdlib lands + # in the directory that libpython will actually search at runtime. 
+ PLATLIBDIR=$(python3.12 -c "import sys; print(sys.platlibdir)") + mkdir -p deb-root/usr/lib/specs/python/${PLATLIBDIR}/python${PYVER} + mkdir -p deb-root/usr/share/doc/specs + cp -r /usr/${PLATLIBDIR}/python${PYVER}/* deb-root/usr/lib/specs/python/${PLATLIBDIR}/python${PYVER}/ 2>/dev/null || \ + cp -r /usr/lib/python${PYVER}/* deb-root/usr/lib/specs/python/${PLATLIBDIR}/python${PYVER}/ + # Copy the shared libpython library into the same platlibdir subtree + cp -L /usr/lib/x86_64-linux-gnu/libpython${PYVER}.so.1.0 deb-root/usr/lib/specs/python/${PLATLIBDIR}/ 2>/dev/null || \ + cp -L /usr/lib/aarch64-linux-gnu/libpython${PYVER}.so.1.0 deb-root/usr/lib/specs/python/${PLATLIBDIR}/ 2>/dev/null || \ + cp -L /usr/lib/libpython${PYVER}.so.1.0 deb-root/usr/lib/specs/python/${PLATLIBDIR}/ 2>/dev/null || true + # Copy license files + install -m 644 LICENSE deb-root/usr/share/doc/specs/LICENSE + install -m 644 PYTHON_LICENSE deb-root/usr/share/doc/specs/PYTHON_LICENSE # Remove unnecessary files to reduce package size - find deb-root/usr/lib/specs/python/lib/python${PYVER} -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null || true - find deb-root/usr/lib/specs/python/lib/python${PYVER} -type f -name "*.pyc" -delete - find deb-root/usr/lib/specs/python/lib/python${PYVER} -type f -name "*.pyo" -delete - rm -rf deb-root/usr/lib/specs/python/lib/python${PYVER}/site-packages - rm -rf deb-root/usr/lib/specs/python/lib/python${PYVER}/dist-packages + find deb-root/usr/lib/specs/python/${PLATLIBDIR}/python${PYVER} -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null || true + find deb-root/usr/lib/specs/python/${PLATLIBDIR}/python${PYVER} -type f -name "*.pyc" -delete + find deb-root/usr/lib/specs/python/${PLATLIBDIR}/python${PYVER} -type f -name "*.pyo" -delete + rm -rf deb-root/usr/lib/specs/python/${PLATLIBDIR}/python${PYVER}/site-packages + rm -rf deb-root/usr/lib/specs/python/${PLATLIBDIR}/python${PYVER}/dist-packages - name: Build DEB run: | diff --git a/PYTHON_LICENSE 
b/PYTHON_LICENSE new file mode 100644 index 0000000..173f298 --- /dev/null +++ b/PYTHON_LICENSE @@ -0,0 +1,42 @@ +PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2 + +1. This LICENSE AGREEMENT is between the Python Software Foundation ("PSF"), and + the Individual or Organization ("Licensee") accessing and otherwise using this + software ("Python") in source or binary form and its associated documentation. + +2. Subject to the terms and conditions of this License Agreement, PSF hereby + grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, + analyze, test, perform and/or display publicly, prepare derivative works, + distribute, and otherwise use Python alone or in any derivative + version, provided, however, that PSF's License Agreement and PSF's notice of + copyright, i.e., "Copyright © 2001-2023 Python Software Foundation; All Rights + Reserved" are retained in Python alone or in any derivative version + prepared by Licensee. + +3. In the event Licensee prepares a derivative work that is based on or + incorporates Python or any part thereof, and wants to make the + derivative work available to others as provided herein, then Licensee hereby + agrees to include in any such work a brief summary of the changes made to Python. + +4. PSF is making Python available to Licensee on an "AS IS" basis. + PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY WAY OF + EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND DISCLAIMS ANY REPRESENTATION OR + WARRANTY OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE + USE OF PYTHON WILL NOT INFRINGE ANY THIRD PARTY RIGHTS. + +5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON + FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS A RESULT OF + MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON, OR ANY DERIVATIVE + THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF. + +6. 
This License Agreement will automatically terminate upon a material breach of + its terms and conditions. + +7. Nothing in this License Agreement shall be deemed to create any relationship + of agency, partnership, or joint venture between PSF and Licensee. This License + Agreement does not grant permission to use PSF trademarks or trade name in a + trademark sense to endorse or promote products or services of Licensee, or any + third party. + +8. By copying, installing or otherwise using Python, Licensee agrees + to be bound by the terms and conditions of this License Agreement. diff --git a/README.md b/README.md index 91fa2b1..7433937 100644 --- a/README.md +++ b/README.md @@ -60,7 +60,7 @@ For detailed build instructions covering Linux, Mac OS, and Windows (both `make` Known Issues ============ * Regular expression grammars other than the default `ECMAScript` don't work except on Mac OS. -* On Windows with Python support, `python312.dll` must be in the path (or Python 3.12 must be installed). +* The Python-enabled Windows MSI bundles its own Python 3.12 runtime, so no system Python is required. The standalone Python-enabled `.exe` (downloaded on its own, outside the MSI) still needs `python312.dll` on the path (or Python 3.12 installed). Contributing ============ diff --git a/specs/src/setup.py b/specs/src/setup.py index 7bab222..5843d3a 100644 --- a/specs/src/setup.py +++ b/specs/src/setup.py @@ -741,6 +741,10 @@ def python_search(arg): if CFG_python: condcomp = condcomp + " " + python_cflags + "{}PYTHON_VER_{}".format(def_prefix,python_version) \ + "{}PYTHON_FULL_VER={}".format(def_prefix,full_python_version) + + # Determine if we should bundle Python (only on GitHub CI builds) + bundle_python = (os.environ.get("SPECS_BUILD_SOURCE", "local") == "github") and CFG_python + if args.static_link and platform!="NT": # Statically link libpython so the binary works regardless of the # Python version installed on the target system. 
@@ -761,6 +765,31 @@ def python_search(arg): # Older libpython static archives may not be PIE-compatible, so disable PIE static_pyldflags.append("-no-pie") condlink = condlink + " " + " ".join(static_pyldflags) + elif bundle_python and platform!="NT": + # Bundle the shared libpython and stdlib, and point the binary at them. + # The install prefix differs per platform: the macOS .pkg installs under + # /usr/local, while the Linux RPM/DEB packages install under /usr. + if sys.platform=="darwin": + bundle_prefix = "/usr/local/lib/specs/python" + else: + bundle_prefix = "/usr/lib/specs/python" + # The rpath must match the platlibdir of the Python being bundled + # (e.g. "lib" on Debian/Ubuntu, "lib64" on Fedora/RHEL) so that the + # dynamic linker finds libpython in the correct subdirectory. + platlibdir = sys.platlibdir + # Define the path where the bundled stdlib will be installed + condcomp = condcomp + '{}PYTHON_STDLIB_PATH=\\"{}\\"'.format(def_prefix, bundle_prefix) + # Add rpath so the bundled libpython is found first. + # On Linux, --disable-new-dtags emits DT_RPATH instead of DT_RUNPATH; + # DT_RPATH is searched before ld.so.cache, ensuring the bundled + # libpython takes precedence over any system-installed libpython3.12. + # macOS uses Apple ld which does not support --disable-new-dtags, and + # does not need it (install_name_tool rewrites the dylib reference). 
+ if sys.platform=="darwin": + rpath_flags = "-Wl,-rpath,{}/lib".format(bundle_prefix) + else: + rpath_flags = "-Wl,--disable-new-dtags,-rpath,{}/{}".format(bundle_prefix, platlibdir) + condlink = condlink + " " + rpath_flags + " " + python_ldflags else: condlink = condlink + " " + python_ldflags else: diff --git a/specs/src/utils/PythonIntf.cc b/specs/src/utils/PythonIntf.cc index 368a239..cb1d34a 100644 --- a/specs/src/utils/PythonIntf.cc +++ b/specs/src/utils/PythonIntf.cc @@ -12,6 +12,7 @@ #include #include #include +#include // Some defines for compatibility #ifdef PYTHON_VER_3 @@ -356,15 +357,46 @@ class PythonFunctionCollection : public ExternalFunctionCollection { return; } // Initialize Python environment -#ifdef PYTHON_STDLIB_PATH - // When Python is statically linked, use PyConfig to set the home - // directory to our bundled stdlib (Py_SetPythonHome was deprecated - // in Python 3.11). - PyConfig config; - PyConfig_InitPythonConfig(&config); - PyConfig_SetBytesString(&config, &config.home, PYTHON_STDLIB_PATH); - Py_InitializeFromConfig(&config); - PyConfig_Clear(&config); +#if defined(WIN64) + // On Windows the MSI bundles the stdlib next to specs.exe (in a "Lib" + // subdirectory) together with pythonXY.dll, so the program runs without + // any system Python installation. Point Python's home at the executable's + // directory when that bundled layout is present; otherwise fall back to + // the default search (e.g. a developer build using a system Python). 
+ bool bundledStdlibInitialized = false; + { + wchar_t exePath[MAX_PATH]; + DWORD exePathLen = GetModuleFileNameW(NULL, exePath, MAX_PATH); + if (exePathLen > 0 && exePathLen < MAX_PATH) { + std::filesystem::path exeDir = std::filesystem::path(exePath).parent_path(); + if (std::filesystem::exists(exeDir / "Lib")) { + PyConfig config; + PyConfig_InitPythonConfig(&config); + PyConfig_SetString(&config, &config.home, exeDir.wstring().c_str()); + Py_InitializeFromConfig(&config); + PyConfig_Clear(&config); + bundledStdlibInitialized = true; + } + } + } + if (!bundledStdlibInitialized) { + Py_Initialize(); + } +#elif defined(PYTHON_STDLIB_PATH) + // When Python is bundled (either statically linked or as a shared library + // with rpath), use PyConfig to set the home directory to our bundled stdlib + // (Py_SetPythonHome was deprecated in Python 3.11). + // Only set the home path if the bundled directory actually exists. + bool use_bundled_stdlib = std::filesystem::exists(PYTHON_STDLIB_PATH); + if (use_bundled_stdlib) { + PyConfig config; + PyConfig_InitPythonConfig(&config); + PyConfig_SetBytesString(&config, &config.home, PYTHON_STDLIB_PATH); + Py_InitializeFromConfig(&config); + PyConfig_Clear(&config); + } else { + Py_Initialize(); + } #else Py_Initialize(); #endif From 75b603a302de8120db43138a47dc623dd525a86a Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Sun, 21 Jun 2026 17:20:27 +0300 Subject: [PATCH 49/50] Issue #440 - add the make uninstall target (#442) --- .github/packaging/postinstall_macos | 1 + .github/packaging/postuninstall_macos | 33 +++++++++++++++++++++++++++ specs/src/setup.py | 22 +++++++++++++++--- 3 files changed, 53 insertions(+), 3 deletions(-) create mode 100755 .github/packaging/postuninstall_macos diff --git a/.github/packaging/postinstall_macos b/.github/packaging/postinstall_macos index 31f45d2..c8a7650 100755 --- a/.github/packaging/postinstall_macos +++ b/.github/packaging/postinstall_macos @@ -25,6 +25,7 @@ autoload -Uz compinit compinit # 
<<< specs completion <<< EOF + echo "Enabling auto-complete on this machine. Restarting the terminal may be required." fi exit 0 diff --git a/.github/packaging/postuninstall_macos b/.github/packaging/postuninstall_macos new file mode 100755 index 0000000..99a6a40 --- /dev/null +++ b/.github/packaging/postuninstall_macos @@ -0,0 +1,33 @@ +#!/bin/bash +# Disable the zsh completion for specs that was enabled by postinstall_macos. +# +# This script is the counterpart of postinstall_macos. It is used by the local +# "make uninstall" target to undo the change postinstall_macos made to +# /etc/zshrc, so the behaviour stays symmetric with installation. +# +# It is idempotent: the sentinel-wrapped block is only removed when the markers +# are present, so running it any number of times never fails or touches an +# /etc/zshrc that we never modified. +set -e + +ZSHRC="/etc/zshrc" +MARKER="# >>> specs completion >>>" +END_MARKER="# <<< specs completion <<<" + +if grep -qF "$MARKER" "$ZSHRC" 2>/dev/null; then + tmp="$(mktemp)" + # Drop the sentinel-wrapped block as well as any blank line(s) that + # immediately precede it (postinstall_macos inserts one before the block). 
+ awk -v start="$MARKER" -v end="$END_MARKER" ' + $0 == start { blanks = ""; inblock = 1; next } + inblock { if ($0 == end) inblock = 0; next } + /^[[:space:]]*$/ { blanks = blanks $0 "\n"; next } + { printf "%s", blanks; blanks = ""; print } + END { printf "%s", blanks } + ' "$ZSHRC" > "$tmp" + cat "$tmp" > "$ZSHRC" + /bin/rm -f "$tmp" + echo "Disabling auto-complete on this machine" +fi + +exit 0 diff --git a/specs/src/setup.py b/specs/src/setup.py index 5843d3a..3140bce 100644 --- a/specs/src/setup.py +++ b/specs/src/setup.py @@ -263,6 +263,22 @@ def python_search(arg): install_win: $(EXE_DIR)/specs.exe echo "Please copy the file specs.exe in the EXE dir to a location on the PATH" + +uninstall_mac: + /bin/rm -f /usr/local/bin/specs + /bin/rm -f /usr/local/bin/specs-autocomplete + /bin/rm -f /usr/local/share/man/man1/specs.1.gz + /bin/rm -f /usr/local/share/zsh/site-functions/_specs + /bin/bash ../../.github/packaging/postuninstall_macos + +uninstall_linux: + /bin/rm -f /usr/local/bin/specs + /bin/rm -f /usr/local/bin/specs-autocomplete + /bin/rm -f /usr/local/share/man/man1/specs.1.gz + /bin/rm -f /etc/bash_completion.d/specs + +uninstall_win: + echo "Installation on Windows only copies specs.exe to the PATH; nothing to uninstall. Please manually remove specs.exe if desired." 
""" clear_clean_posix = \ @@ -845,10 +861,10 @@ def python_search(arg): makefile.write("{}\n".format(clear_clean_part)) if sys.platform=="darwin": - makefile.write("{}\n\ninstall: install_mac\n".format(manpart)) + makefile.write("{}\n\ninstall: install_mac\n\nuninstall: uninstall_mac\n".format(manpart)) elif platform=="NT": - makefile.write("install: install_win\n") + makefile.write("install: install_win\n\nuninstall: uninstall_win\n") else: - makefile.write("{}\n\ninstall: install_linux\n".format(manpart)) + makefile.write("{}\n\ninstall: install_linux\n\nuninstall: uninstall_linux\n".format(manpart)) sys.stderr.write("Makefile created.\n") From b02d56a36c227a3b37e7331374fef56e10c5c12e Mon Sep 17 00:00:00 2001 From: Yoav Nir Date: Mon, 22 Jun 2026 00:29:35 +0300 Subject: [PATCH 50/50] Issue #439 - adjust README and branches in c-cpp (#444) --- .github/workflows/c-cpp.yml | 4 ++-- README.md | 33 +++++++++------------------------ 2 files changed, 11 insertions(+), 26 deletions(-) diff --git a/.github/workflows/c-cpp.yml b/.github/workflows/c-cpp.yml index ccd7020..6e66a84 100644 --- a/.github/workflows/c-cpp.yml +++ b/.github/workflows/c-cpp.yml @@ -2,9 +2,9 @@ name: C/C++ CI on: push: - branches: [ dev, stable, dev-0.9.9, dev-1.0.0, dev-nodejs-24] + branches: [ dev, stable, dev-1.0.0, dev-1.1.0] pull_request: - branches: [ dev, stable, dev-0.9.9, dev-1.0.0, dev-nodejs-24] + branches: [ dev, stable, dev-1.0.0, dev-1.1.0] env: SPECS_BRANCH: ${{ github.event.pull_request.base.ref || github.ref_name }} diff --git a/README.md b/README.md index 7433937..22c1c74 100644 --- a/README.md +++ b/README.md @@ -13,26 +13,13 @@ News 11-Sep-2026: Version 1.0.0 is here What's new: - * All pre-built binaries now work with Python 3.12 - * Support Python in `MSBuild` builds - * Added MSI and stand-alone Windows executable to release artifacts - * Debugging aids for GDB - * Rolling context support + * Rolling context allows processing previous and future records. 
+ * Packages for Linux, Mac OS and Windows, bundled with Python 3.12. + * Python support in `MSBuild` builds. + * Exactness in Python function arguments and return value. + * Build information system-defined labels, such as `@build-info` and `@build-url`. + * Debugging aids for GDB. -*** -1-May-2026: Version 0.9.9 is here - -What's new: - * MSI package & standalone executable for Windows - * .pkg package for Mac OS - * RPM for Linux - * .deb package for Ubuntu/Debian - * Visual Studio infra for building for Windows - * Improved guessing of Python version - * New spec units: `SPLITW` and `SPLITF` for splitting input records by words or fields into multiple output records. These new spec units support optional custom separators, `OF` clauses (with the same semantics as SUBSTRING), and range output placement (e.g. `splitw 1-10`). - * A more exact `exact()` function - -*Note:* Installing from package does not include Python support on Windows. Sources ======= @@ -44,9 +31,6 @@ Installation from binaries ========================== The binaries for the latest release can be downloaded from [**the release page**](https://github.com/yoavnir/specs2016/releases/tag/v1.0.0) -**Requirements:** - * **Python 3.12 must be installed on your target machine.** All pre-built binaries (Linux RPM, Linux DEB, macOS .pkg, and Windows MSI/executable) are dynamically linked against Python 3.12. - **Notes:** * On Windows for ARM, you may install the x64 version of Python 3.12. * Recent Mac OS versions are very strict on where packages come from. You may need to issue the following command to get the .pkg file to install: `xattr -dr com.apple.quarantine /path/to/specs-1.0.0.pkg` @@ -55,7 +39,7 @@ Building ======== For detailed build instructions covering Linux, Mac OS, and Windows (both `make` and MSBuild), see [BUILDING.md](BUILDING.md). -**Note on Python versions:** The pre-built binaries are linked against Python 3.12. 
If you need to use a different version of Python, or if Python 3.12 is not available on your target platform, you must build `specs` locally from source. When building, you can specify which Python version to use via the `--python` option to `setup.py` (on Linux/macOS) or by setting the appropriate Python version in your Visual Studio environment (on Windows). +**Note on Python versions:** The pre-built binaries are linked against Python 3.12, and bundled with it. If you need to use a different version of Python, you can build `specs` locally from source. When building, you can specify which Python version to use via the `--python` option to `setup.py` (on Linux/macOS) or by setting the appropriate Python version in your Visual Studio environment (on Windows). Known Issues ============ @@ -92,4 +76,5 @@ The documentation for *specs2016* exists in two places: License ======= -*specs2016* is licensed under the [MIT License](https://github.com/yoavnir/specs2016/blob/dev/LICENSE). +* *specs2016* is licensed under the [MIT License](https://github.com/yoavnir/specs2016/blob/dev/LICENSE). +* The *Python 3.12* library bundled with the GitHub-built packages is licensed under the [Python Software Foundation License](https://github.com/yoavnir/specs2016/blob/dev-1.0.0/PYTHON_LICENSE)