Build for Linux ARM64 #174

Open
lfoppiano opened this issue Jan 18, 2025 · 33 comments

Comments

@lfoppiano
Collaborator

Follow up from here: kermitt2/grobid#1230 (comment)

@AaronNGray

Yes, likewise, it's not top priority for me right now as I am working on the sister project to the search engine crawler at the moment. But hopefully I will get time before you will. I have programmed in C++ since the early days so it should be easy enough to get to the bottom of.

@AaronNGray

Okay, the status on this issue: someone called Aazhar, who is no longer on GitHub, added ICU support in ab083a6. So this is an unknown: ICU support was added without leaving a trace as to which version of ICU it is or where it came from.

I need to look into the latest ICU. AFAICT ARM64/Ubuntu uses 32-bit wchar_t's rather than 16-bit ones, and 32-bit wchar_t's are not supported in icu/common/local/unistr.h.

I need to do a test to verify this.

https://github.com/unicode-org/icu/blob/main/icu4c/source/common/unicode/unistr.h#L3152
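
A quick check along these lines could confirm the wchar_t layout on each runner. This is only a sketch using the standard library: the helper names wchar_bits, wchar_is_signed and describe_wchar are made up for illustration and are not part of ICU or pdfalto.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>
#include <type_traits>

// Width of wchar_t in bits on the platform this is compiled for.
constexpr std::size_t wchar_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// Whether wchar_t is a signed type on this platform.
constexpr bool wchar_is_signed() {
    return std::is_signed<wchar_t>::value;
}

// Human-readable summary, e.g. for printing from a CI job.
std::string describe_wchar() {
    // Sanity check: wchar_t is at least one byte wide on any conforming platform.
    assert(wchar_bits() >= CHAR_BIT);
    return std::to_string(wchar_bits()) + "-bit " +
           (wchar_is_signed() ? "signed" : "unsigned") + " wchar_t";
}
```

Printing describe_wchar() from a small test program on the x86_64 and ARM64 runners and comparing the results would show whether the width differs or only the signedness.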

@lfoppiano
Collaborator Author

I'd assume from here it's version 62, but it is not clear.

@AaronNGray

Thanks, the source said a really, really old version, so I was not sure. Sorry, I should have read that.

@lfoppiano
Collaborator Author

I asked @flydutch to have a look at these libraries, update them, and then work on an ARM build. We're also working on an update to xpdf 4.05 (see related #173), and we should also update the library dependencies.
@AaronNGray if you have any partial work done, could we continue from where you got to?

I also wish to set up GitHub Actions that can build and post the binaries somewhere, so that we avoid messing around with all these versions/architectures.

@AaronNGray

AaronNGray commented Feb 18, 2025

@lfoppiano Sorry, I had intended to have all this done by now.

Ideally we set up GitHub Actions on the repos to post binaries to their release pages; for those we don't own, we can either clone them or open pull requests.

I got pdfalto updated to work with the latest ICU built with ./configure --enable-static and this involved updating pdfalto to C++17.

https://github.com/AaronNGray/pdfalto/tree/master

There's a pull request here :-

#177

@lfoppiano
Collaborator Author

Great. Thanks. I've created a larger branch where we should update all external libraries (if possible), and merge into that.

I think the automatic build will be necessary for this release. BTW, I would also like to post binaries for intermediate versions (maybe limited to master), not only for releases. I will try to find time to look into it.

@AaronNGray

I am starting work on GitHub Actions for ICU builds for MacOS ARM64 and Ubuntu 22.04 ARM64, as there does not seem to be a binary release for either, so that I can get ICU to do releases for them. I set up a self-hosted GitHub Actions runner on my local ARM64 Ubuntu box today, which was fun.

@lfoppiano
Collaborator Author

@AaronNGray thanks, indeed I will need to set up a custom build machine for the different architectures. I have temporarily reverted to the GitHub runners (without ARM), so that the builds are running.

@AaronNGray

AaronNGray commented Feb 19, 2025

@lfoppiano There are both Ubuntu- and Windows-based ARM64 GitHub Actions runners (public repos only), but AFAIK not MacOS :-
https://www.google.com/search?q=github+arm+action+runners

I can run the self hosted GitHub runner on my MacBook for releases if necessary.

@lfoppiano
Collaborator Author

I set up macos-12 and macos-14 (ARM). Hopefully it will work, but I'm not sure how to test it.

@AaronNGray

AaronNGray commented Feb 20, 2025

> I set up macos-12 and macos-14 (ARM). Hopefully it will work, but I'm not sure how to test it.

Unfortunately they are failing https://github.com/kermitt2/pdfalto/actions/runs/13424243421

uconv is missing on :-
https://github.com/kermitt2/pdfalto/actions/runs/13424243421/job/37503837019#step:5:20

https://github.com/AaronNGray/pdfalto/actions/runs/13435341396/job/37536256574#step:4:263

Warning: No available formula with the name "uconv". Did you mean uggconv, ucon64, cconv or unoconv?
uconv is part of the icu4c formula:
  brew install icu4c

but uconv is still not found after either :-

 brew install icu4c

or

 brew reinstall icu4c@76

I put this as a separate issue :- #179

error: call of overloaded ‘UnicodeString(wchar_t)’ is ambiguous
https://github.com/kermitt2/pdfalto/actions/runs/13424243421/job/37503837264#step:8:583
It looks like you are using ICU 77, which is not released yet. I am not sure whether ICU v76.1 works on ARM64 Ubuntu.

There appears to be no run of the ICU tests on ubuntu-24.04-arm, nor any test for :-
new UnicodeString(wchar_t(...))
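
To illustrate how such an ambiguity can arise without pulling in ICU, here is a sketch with a stand-in class. FakeUnicodeString and from_wchar are hypothetical names; the stand-in only mirrors the two single-character constructors that ICU documents for UnicodeString (char16_t and UChar32, a 32-bit signed integer). My assumption, not verified against the failing build, is that wchar_t is unsigned on ARM64 Linux, so a bare wchar_t argument needs a conversion for either constructor and the compiler cannot prefer one. An explicit cast avoids depending on that:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for icu::UnicodeString: it only mirrors the two
// single-character constructors (char16_t and UChar32, i.e. int32_t).
struct FakeUnicodeString {
    enum class Ctor { CodeUnit16, CodePoint32 };
    Ctor used;           // which constructor was selected
    std::int32_t value;  // the character value that was passed in

    explicit FakeUnicodeString(char16_t ch)
        : used(Ctor::CodeUnit16), value(ch) {}
    explicit FakeUnicodeString(std::int32_t ch)
        : used(Ctor::CodePoint32), value(ch) {}
};

// Passing a bare wchar_t leaves the overload choice to the platform's
// promotion rules. Casting to the 32-bit code point type first selects
// the same constructor on every platform.
inline FakeUnicodeString from_wchar(wchar_t wc) {
    return FakeUnicodeString(static_cast<std::int32_t>(wc));
}
```

If this is indeed the cause, the equivalent change in pdfalto would be a cast at each call site rather than a change in ICU itself.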

I need to create ARM builds and tests for ICU and test these :-
https://github.com/unicode-org/icu/blob/main/.github/workflows/icu4c.yml#L777

I have a branch for pdfalto with updated ICU with sprintf fixes that builds on MacOS.
master...AaronNGray:pdfalto:snprintf-mods
This does not run as a GitHub Actions test due to the uconv issue, but it builds locally.
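
For reference, the kind of change the branch name suggests, replacing sprintf with a bounded snprintf call, looks roughly like this. It is an illustrative sketch, not code from the branch; format_id and the "p%d" format are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats an id into a caller-supplied buffer.
// Returns true only if the whole string fitted (no truncation, no error).
bool format_id(char* buf, std::size_t size, int id) {
    // snprintf writes at most `size` bytes, including the terminating NUL,
    // and returns the length the full string would have had.
    int n = std::snprintf(buf, size, "p%d", id);
    return n >= 0 && static_cast<std::size_t>(n) < size;
}
```

Checking the return value is what lets the caller detect truncation, which an unbounded sprintf cannot report.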

I am going back to working on getting ICU ARM64 builds in place. Once I have that I will be able to sort out MacOS and Ubuntu ARM builds, but if you want to go ahead on them, here are my latest ARM64 Action tests and results :-

https://github.com/AaronNGray/github-arm64-action-runner-tests/blob/main/.github/workflows/github-action-runner-tests.yml
https://github.com/AaronNGray/github-arm64-action-runner-tests/actions/runs/13433592749

There are no GitHub hosted Windows arm64 action runners. We can still do cross compilation builds for Windows ARM64 targets, as ICU does :-
https://github.com/unicode-org/icu/blob/main/.github/workflows/icu4c.yml#L412

I wasted two days trying to get this in place :( You can do Windows ARM64 tests on Self Hosted platforms though, which I will have in place temporarily for testing at some point hopefully.

@lfoppiano
Collaborator Author

@AaronNGray I think it's better if you keep working on ICU since you're already ahead.

Just a couple of points:

  • ICU version: IMHO we should update to the latest ICU version. Could you do that if you haven't done it already?
  • Windows build: let's forget about it for the moment. It's too much work to maintain three platforms.

@AaronNGray

> I think it's better if you keep working on ICU since you're already ahead.

Okay, but this is a showstopper for MacOS, so if time is an issue and you are not working on other things, it might be an idea for you to look into it too.

> uconv is missing on

uconv is part of ICU's icu4c, so I need to get the MacOS ARM64 ICU version built so that we can use that: uconv is part of icu4c's extras and does not seem to be in the icu4c or icu4c@76 brew installs.

> ICU version: IMHO we should update to the latest ICU version. Could you do that if you haven't done it already?

Do you mean the 'main' branch rather than the latest release? :- https://github.com/unicode-org/icu/releases/tag/release-76-1

> Windows build: let's forget about it for the moment. It's too much work to maintain three platforms.

Yes, Windows is not a priority for me at all especially not on ARM64.

@lfoppiano
Collaborator Author

> Okay, but this is a showstopper for MacOS, so if time is an issue and you are not working on other things, it might be an idea for you to look into it too.

> uconv is part of ICU's icu4c, so I need to get the MacOS ARM64 ICU version built so that we can use that: uconv is part of icu4c's extras and does not seem to be in the icu4c or icu4c@76 brew installs.

Sure, I can take care of it.

> Do you mean the 'main' branch rather than the latest release? :- https://github.com/unicode-org/icu/releases/tag/release-76-1

No, the latest release.

@lfoppiano
Collaborator Author

@AaronNGray I implemented a build for all the libraries except ICU, so that we can get static libraries from an automatic build. I'm not sure this is the proper way to do it.
I thought it could be a manual workflow that would build multiple archs automatically. What do you think?

@AaronNGray

AaronNGray commented Feb 23, 2025

@lfoppiano Yes, we do it manually first. Then we automate with manual triggers for proper releases, and maybe push-triggered builds for bleeding-edge builds. Ideally each project writes automatically to its release page, and pdfalto can then curl/wget the zipped binaries from there. We can probably use input fields on manually triggered builds for version input or selection. So yes, we have to get it all working manually first. I will see if I can get the ICU ARM64 76-1 builds done today.

@AaronNGray

> I implemented a build for all the libraries except ICU, so that we can get static libraries from an automatic build. I'm not sure this is the proper way to do it.

Oh, I have just seen your https://github.com/kermitt2/pdfalto/blob/master/.github/workflows/ci-build.yml script. It looks good; we might as well just add ICU to it for now.

I just built and ran it on MacOS 15.3.1 and it ran okay apart from a couple of warnings :-

% ./pdfalto ~/Documents/1809.01427.pdf
Config Error: Bad line in 'nameToUnicode' file (languages/xpdf-others/ligatures.nameToUnicode:5)
Config Error: Bad line in 'nameToUnicode' file (languages/xpdf-others/ligatures.nameToUnicode:6)
aarongray@Aarons-MacBook-Air pdfalto % ./pdfalto ~/Documents/gapayev-et-al-icfp2000.pdf
Config Error: Bad line in 'nameToUnicode' file (languages/xpdf-others/ligatures.nameToUnicode:5)
Config Error: Bad line in 'nameToUnicode' file (languages/xpdf-others/ligatures.nameToUnicode:6)

@AaronNGray

AaronNGray commented Feb 23, 2025

@lfoppiano - ICU binaries - I have not tried linking and running them with pdfalto yet.

Not totally sure about the packaging directory structures, but here are the .zip files. I should make .tar.gz archives as well.

MacOS/clang/arm64 :- https://github.com/AaronNGray/icu/actions/runs/13485968805/artifacts/2637738882
https://github.com/AaronNGray/icu/actions/runs/13485968805/job/37677138243#step:10:46

Ubuntu/gcc/arm64 :- https://github.com/AaronNGray/icu/actions/runs/13485968806/artifacts/2637740238
https://github.com/AaronNGray/icu/actions/runs/13485968806/job/37677138240#step:8:47

Repo branch :- https://github.com/AaronNGray/icu/tree/arm64

@AaronNGray

I still need to understand some more stuff in order to streamline everything properly with GitHub Actions.

@lfoppiano
Collaborator Author

@AaronNGray thanks. Is it OK if I integrate your build into this repository? I could check out the GitHub repository or take just the release. I would like to just save the static library files we need, rather than the full packaged binaries.

@AaronNGray

@lfoppiano Okay, anything that works for you at this point. We can deal with doing proper releases, and with downloading and integrating the different projects, separately later.

@AaronNGray

@lfoppiano if you want an SSH account on an Ubuntu/ARM64 device for testing, I can set one up for you.

@lfoppiano
Collaborator Author

lfoppiano commented Feb 24, 2025

Thanks @AaronNGray. Indeed, having access to a Linux ARM machine would be useful for me. You can contact me at luca AT sciencialab.com

Meanwhile, I thought that having separate jobs would be more flexible when it comes to updating libraries, but it's a bit of a hell to download the artifacts one by one and place them in the right directory. 😅 At least for now there are builds for all architectures.

@AaronNGray

@lfoppiano I have emailed you SSH details.

> Meanwhile, I thought that having separate jobs would be more flexible when it comes to updating libraries, but it's a bit of a hell to download the artifacts one by one and place them in the right directory. 😅 At least for now there are builds for all architectures.

I think this can all be automated. I am trying to learn everything needed in order to do the integration.

@AaronNGray

AaronNGray commented Feb 24, 2025

@lfoppiano This should hopefully give you access to the repo from within an action. I have not tested it obviously.

You will need to set the permissions first :-

https://graphite.dev/guides/github-actions-permissions#github-actions-object-object-permissions

action step :-

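- name: Deploy Files
  run: |
    # GH_USER, GH_MAIL and GH_TOKEN are repository secrets;
    # :user/:repo are placeholders for the target repository.
    git config user.name "${{ secrets.GH_USER }}"
    git config user.email "${{ secrets.GH_MAIL }}"
    # Add a remote that authenticates with the token
    git remote add gh-token "https://${{ secrets.GH_TOKEN }}@github.com/:user/:repo.git"
    git clone https://github.com/:user/:repo.git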

from :-
https://github.com/orgs/community/discussions/26615#discussioncomment-3252542

There may be a simpler, more direct way using the existing checkout rather than cloning and using the GH_TOKEN.

@lfoppiano
Collaborator Author

That's great! Thanks!

@AaronNGray

AaronNGray commented Feb 26, 2025

@lfoppiano - I am trying to work out what is going on.

Everything was building at this point :-
https://github.com/kermitt2/pdfalto/actions/runs/13515013958/job/37762022821

That was a "Manual build static libraries".

Then it broke here :-
https://github.com/kermitt2/pdfalto/actions/runs/13515085195/job/37762209480

The commit :-
567440a

There seem to be several commits since :-
https://github.com/kermitt2/pdfalto/commits/feature/update-external-libraries/

I have cloned feature/update-external-libraries-merge-builds and its static build is building :-
https://github.com/AaronNGray/pdfalto/actions/runs/13549100441

but the push build is not :-
https://github.com/AaronNGray/pdfalto/actions/runs/13549100443

I cannot determine the difference !

@lfoppiano
Collaborator Author

Yes, when I wiped out the sources of libpng and zlib it stopped working. It seems there are breaking changes in libpng: even after I added the updated sources it still does not work. This should be fixed in the next couple of days. I'll keep you posted.

@AaronNGray

Okay, I will hold off doing anything for now. The old version had multiple possible buffer overruns IIRC. Ideally these should be checked for before a release.

@lfoppiano
Collaborator Author

@AaronNGray I think libpng and zlib are working now, but there seem to be quite a few issues linking the updated ICU library (we use 76-1). Since you've seen it before, do you want to try to work on that?

@AaronNGray

@lfoppiano - you have fixed all builds now :-

https://github.com/kermitt2/pdfalto/actions/runs/13598462749

@lfoppiano
Collaborator Author

Yes, took me quite some time but we've made it 😄
