May Binary Update #1179

Open: wants to merge 34 commits into base: master

Commits (34)
6610db5
add android support
AmSmart Mar 19, 2025
38f6bb3
add maui test project
AmSmart Mar 19, 2025
ded0f07
update workflow
AmSmart Mar 19, 2025
9f8863a
uncomment Android section of Gather Binaries
AmSmart Mar 19, 2025
2ac7871
add ggm-base and ggml-cpu
AmSmart Mar 19, 2025
bb528ba
update MSBuild for ggm-base and ggml-cpu
AmSmart Mar 19, 2025
2105dc3
add support for linux-arm64
nipeone Mar 27, 2025
bc4dde8
update compile.yml
nipeone Mar 27, 2025
aeef2eb
update compile.yml DGGML_CPU_ARM_ARCH=armv8-a
nipeone Mar 27, 2025
80d75d9
update runtime.targets
nipeone Mar 27, 2025
5996b40
Merge branch 'SciSharp:master' into master
nipeone Apr 2, 2025
6f8b7ce
Merge branch 'SciSharp:master' into master
nipeone Apr 7, 2025
31c1218
Merge branch 'SciSharp:master' into master
nipeone Apr 11, 2025
6becd43
Merge branch 'SciSharp:master' into master
nipeone Apr 18, 2025
472140e
create separate Android nuspec
AmSmart Apr 19, 2025
4421364
fix failing tests
AmSmart Apr 19, 2025
bc4bdf9
sync fork
AmSmart Apr 19, 2025
c62980f
Apply suggestions from code review
martindevans Apr 21, 2025
dfb3cc9
Update .github/workflows/compile.yml
martindevans Apr 21, 2025
47f90c4
Merge branch 'master' into master
martindevans Apr 21, 2025
45d1964
Changed for binary update to https://github.com/ggml-org/llama.cpp/co…
martindevans Apr 20, 2025
c06f458
Merge remote-tracking branch 'remotes/nipeone/master' into update_apr…
martindevans Apr 21, 2025
e3f9f6c
Merge remote-tracking branch 'remotes/AmSmart/feature/add-android-sup…
martindevans Apr 21, 2025
fab97c3
Removed Android x86 target, not compatible with some parts of llama.cpp
martindevans Apr 21, 2025
c51fae0
Removed Android x86
martindevans Apr 22, 2025
8aca6d3
Using new binaries
martindevans Apr 28, 2025
c334534
Updated to changes for `ceda28ef8e310a8dee60bf275077a3eedae8e36c`
martindevans Apr 30, 2025
98af2df
Updated to newer binaries (ceda28ef8e310a8dee60bf275077a3eedae8e36c)
martindevans May 1, 2025
67a3aea
update gitignore and add missing xml files
AmSmart May 1, 2025
4654c21
Merge pull request #9 from AmSmart/feature/add-android-support
martindevans May 1, 2025
1074fc7
Fixed spelling
martindevans May 1, 2025
63b9235
Removing llama.mobile on non-windows platforms
martindevans May 2, 2025
358e845
Skipping build of llama.mobile on non-windows platforms
martindevans May 2, 2025
0930060
Removed LLama.Mobile from main solution
martindevans May 4, 2025
98 changes: 62 additions & 36 deletions .github/workflows/compile.yml
Expand Up @@ -28,13 +28,25 @@ jobs:
include:
- build: 'noavx'
defines: '-DGGML_AVX=OFF -DGGML_AVX2=OFF -DGGML_FMA=OFF'
os: ubuntu-24.04
arch: x64
- build: 'avx2'
defines: ''
os: ubuntu-24.04
arch: x64
- build: 'avx'
defines: '-DGGML_AVX2=OFF'
os: ubuntu-24.04
arch: x64
- build: 'avx512'
defines: '-DGGML_AVX512=ON'
runs-on: ubuntu-24.04
os: ubuntu-24.04
arch: x64
- build: 'aarch64'
defines: '-DGGML_NATIVE=OFF -DGGML_CPU_AARCH64=ON -DGGML_CPU_ARM_ARCH=armv8-a'
os: ubuntu-24.04-arm
arch: arm64
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
with:
Expand All @@ -52,28 +64,28 @@ jobs:
- uses: actions/upload-artifact@v4
with:
path: ./build/bin/libllama.so
name: llama-bin-linux-${{ matrix.build }}-x64.so
name: llama-bin-linux-${{ matrix.build }}-${{ matrix.arch }}.so
if-no-files-found: error
- uses: actions/upload-artifact@v4
with:
path: ./build/bin/libggml.so
name: ggml-bin-linux-${{ matrix.build }}-x64.so
name: ggml-bin-linux-${{ matrix.build }}-${{ matrix.arch }}.so
if-no-files-found: error
- uses: actions/upload-artifact@v4
with:
path: ./build/bin/libggml-base.so
name: ggml-base-bin-linux-${{ matrix.build }}-x64.so
name: ggml-base-bin-linux-${{ matrix.build }}-${{ matrix.arch }}.so
if-no-files-found: error
- uses: actions/upload-artifact@v4
with:
path: ./build/bin/libggml-cpu.so
name: ggml-cpu-bin-linux-${{ matrix.build }}-x64.so
name: ggml-cpu-bin-linux-${{ matrix.build }}-${{ matrix.arch }}.so
if-no-files-found: error
- name: Upload Llava
uses: actions/upload-artifact@v4
with:
path: ./build/bin/libllava_shared.so
name: llava-bin-linux-${{ matrix.build }}-x64.so
name: llava-bin-linux-${{ matrix.build }}-${{ matrix.arch }}.so
if-no-files-found: error

compile-musl:
Expand Down Expand Up @@ -527,19 +539,15 @@ jobs:
if-no-files-found: error

compile-android:
# Disable android build
if: false

name: Compile (Android)
strategy:
fail-fast: true
matrix:
include:
- build: 'x86'
defines: '-DANDROID_ABI=x86'
- build: 'x86_64'
defines: '-DANDROID_ABI=x86_64'
defines: '-DANDROID_ABI=x86_64 -DCMAKE_C_FLAGS=-march=x86-64 -DCMAKE_CXX_FLAGS=-march=x86-64'
- build: 'arm64-v8a'
defines: '-DANDROID_ABI=arm64-v8a'
defines: '-DANDROID_ABI=arm64-v8a -DCMAKE_C_FLAGS=-march=armv8.7a -DCMAKE_CXX_FLAGS=-march=armv8.7a'
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@v4
Expand All @@ -555,28 +563,39 @@ jobs:
- name: Build
id: cmake_build
env:
CMAKE_FLAGS: '-DCMAKE_TOOLCHAIN_FILE=${{ steps.setup-ndk.outputs.ndk-path }}/build/cmake/android.toolchain.cmake -DANDROID_PLATFORM=android-23'
CMAKE_FLAGS: '-DCMAKE_TOOLCHAIN_FILE=${{ steps.setup-ndk.outputs.ndk-path }}/build/cmake/android.toolchain.cmake -DANDROID_PLATFORM=android-23 -DGGML_OPENMP=OFF -DGGML_LLAMAFILE=OFF'
run: |
mkdir build
cd build
cmake .. ${{ env.COMMON_DEFINE }} ${{ env.CMAKE_FLAGS }} ${{ matrix.defines }}
cmake --build . --config Release -j ${env:NUMBER_OF_PROCESSORS}
cd ..
ls -R
# export-lora not supported on 32 bit machines hence breaks x86 build
sed -i '/add_subdirectory(export-lora)/d' examples/CMakeLists.txt # remove export-lora from examples
cmake ${{ env.COMMON_DEFINE }} ${{ env.CMAKE_FLAGS }} ${{ matrix.defines }} -B build
cmake --build build --config Release -j $(nproc)
- name: Upload Llama
uses: actions/upload-artifact@v4
with:
path: ./build/src/libllama.so
path: ./build/bin/libllama.so
name: llama-bin-android-${{ matrix.build }}.so
- uses: actions/upload-artifact@v4
- name: Upload GGML
uses: actions/upload-artifact@v4
with:
path: ./build/ggml/src/libggml.so
path: ./build/bin/libggml.so
name: ggml-bin-android-${{ matrix.build }}.so
if-no-files-found: error
- name: Upload GGML Base
uses: actions/upload-artifact@v4
with:
path: ./build/bin/libggml-base.so
name: ggml-base-bin-android-${{ matrix.build }}.so
if-no-files-found: error
- name: Upload GGML CPU
uses: actions/upload-artifact@v4
with:
path: ./build/bin/libggml-cpu.so
name: ggml-cpu-bin-android-${{ matrix.build }}.so
if-no-files-found: error
- name: Upload Llava
uses: actions/upload-artifact@v4
with:
path: ./build/examples/llava/libllava_shared.so
path: ./build/bin/libllava_shared.so
name: llava-bin-android-${{ matrix.build }}.so

build-deps:
Expand All @@ -601,7 +620,7 @@ jobs:
- name: Rearrange Files
run: |
# Make all directories at once
mkdir --parents deps/{noavx,avx,avx2,avx512,musl-noavx,musl-avx,musl-avx2,musl-avx512,osx-arm64,osx-x64,osx-x64-rosetta2,cu11.7.1,cu12.2.0,vulkan,android-arm64-v8a,android-x86,android-x86_64}
mkdir --parents deps/{noavx,avx,avx2,avx512,linux-arm64,musl-noavx,musl-avx,musl-avx2,musl-avx512,osx-arm64,osx-x64,osx-x64-rosetta2,cu11.7.1,cu12.2.0,vulkan,android-arm64-v8a,android-x86,android-x86_64}

# Linux
cp artifacts/ggml-bin-linux-noavx-x64.so/libggml.so deps/noavx/libggml.so
Expand All @@ -628,6 +647,13 @@ jobs:
cp artifacts/llama-bin-linux-avx512-x64.so/libllama.so deps/avx512/libllama.so
cp artifacts/llava-bin-linux-avx512-x64.so/libllava_shared.so deps/avx512/libllava_shared.so

# Arm64
cp artifacts/ggml-bin-linux-aarch64-arm64.so/libggml.so deps/linux-arm64/libggml.so
cp artifacts/ggml-base-bin-linux-aarch64-arm64.so/libggml-base.so deps/linux-arm64/libggml-base.so
cp artifacts/ggml-cpu-bin-linux-aarch64-arm64.so/libggml-cpu.so deps/linux-arm64/libggml-cpu.so
cp artifacts/llama-bin-linux-aarch64-arm64.so/libllama.so deps/linux-arm64/libllama.so
cp artifacts/llava-bin-linux-aarch64-arm64.so/libllava_shared.so deps/linux-arm64/libllava_shared.so

# Musl
cp artifacts/ggml-bin-musl-noavx-x64.so/libggml.so deps/musl-noavx/libggml.so
cp artifacts/ggml-base-bin-musl-noavx-x64.so/libggml-base.so deps/musl-noavx/libggml-base.so
Expand Down Expand Up @@ -703,17 +729,17 @@ jobs:
cp artifacts/llava-bin-osx-x64-rosetta2.dylib/libllava_shared.dylib deps/osx-x64-rosetta2/libllava_shared.dylib

# Android
#cp artifacts/ggml-bin-android-arm64-v8a.so/libggml.so deps/android-arm64-v8a/libggml.so
#cp artifacts/llama-bin-android-arm64-v8a.so/libllama.so deps/android-arm64-v8a/libllama.so
#cp artifacts/llava-bin-android-arm64-v8a.so/libllava_shared.so deps/android-arm64-v8a/libllava_shared.so

#cp artifacts/ggml-bin-android-x86.so/libggml.so deps/android-x86/libggml.so
#cp artifacts/llama-bin-android-x86.so/libllama.so deps/android-x86/libllama.so
#cp artifacts/llava-bin-android-x86.so/libllava_shared.so deps/android-x86/libllava_shared.so

#cp artifacts/ggml-bin-android-x86_64.so/libggml.so deps/android-x86_64/libggml.so
#cp artifacts/llama-bin-android-x86_64.so/libllama.so deps/android-x86_64/libllama.so
#cp artifacts/llava-bin-android-x86_64.so/libllava_shared.so deps/android-x86_64/libllava_shared.so
cp artifacts/ggml-bin-android-arm64-v8a.so/libggml.so deps/android-arm64-v8a/libggml.so
cp artifacts/ggml-base-bin-android-arm64-v8a.so/libggml-base.so deps/android-arm64-v8a/libggml-base.so
cp artifacts/ggml-cpu-bin-android-arm64-v8a.so/libggml-cpu.so deps/android-arm64-v8a/libggml-cpu.so
cp artifacts/llama-bin-android-arm64-v8a.so/libllama.so deps/android-arm64-v8a/libllama.so
cp artifacts/llava-bin-android-arm64-v8a.so/libllava_shared.so deps/android-arm64-v8a/libllava_shared.so
cp artifacts/ggml-bin-android-x86_64.so/libggml.so deps/android-x86_64/libggml.so
cp artifacts/ggml-base-bin-android-x86_64.so/libggml-base.so deps/android-x86_64/libggml-base.so
cp artifacts/ggml-cpu-bin-android-x86_64.so/libggml-cpu.so deps/android-x86_64/libggml-cpu.so
cp artifacts/llama-bin-android-x86_64.so/libllama.so deps/android-x86_64/libllama.so
cp artifacts/llava-bin-android-x86_64.so/libllava_shared.so deps/android-x86_64/libllava_shared.so

# Windows CUDA
cp artifacts/ggml-bin-win-cublas-cu11.7.1-x64.dll/ggml.dll deps/cu11.7.1/ggml.dll
Expand Down
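The "Rearrange Files" step repeats the same five `cp` lines for every target. As a rough sketch only (not what the workflow does), the pattern could be expressed as a loop. `artifact_prefix` and `copy_libs` are hypothetical helper names; the artifact and `deps/` layout is assumed to match the paths shown in the diff above.

```shell
# Map a library file name to the artifact prefix used by the upload steps.
artifact_prefix() {
  case "$1" in
    libllama.so)        echo llama ;;
    libggml.so)         echo ggml ;;
    libggml-base.so)    echo ggml-base ;;
    libggml-cpu.so)     echo ggml-cpu ;;
    libllava_shared.so) echo llava ;;
    *) echo "unknown library: $1" >&2; return 1 ;;
  esac
}

# copy_libs SUFFIX DEST
#   SUFFIX: the part of the artifact name after "-bin-", e.g. "linux-aarch64-arm64"
#   DEST:   the directory under deps/, e.g. "linux-arm64"
copy_libs() {
  suffix=$1
  dest=$2
  mkdir -p "deps/${dest}"
  for lib in libllama.so libggml.so libggml-base.so libggml-cpu.so libllava_shared.so; do
    prefix=$(artifact_prefix "$lib") || return 1
    # e.g. artifacts/ggml-base-bin-linux-aarch64-arm64.so/libggml-base.so
    cp "artifacts/${prefix}-bin-${suffix}.so/${lib}" "deps/${dest}/${lib}"
  done
}
```

With these helpers, the new targets would be `copy_libs linux-aarch64-arm64 linux-arm64`, `copy_libs android-arm64-v8a android-arm64-v8a` and `copy_libs android-x86_64 android-x86_64`. The explicit `cp` lines in the workflow are easier to grep, so the loop is a trade-off rather than a clear improvement.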
9 changes: 9 additions & 0 deletions .github/workflows/main.yml
Expand Up @@ -38,6 +38,15 @@ jobs:
with:
dotnet-version: |
8.0.x
- name: Install Mobile Workloads
if: ${{ contains(runner.os, 'windows') }}
run: |
dotnet workload install android --ignore-failed-sources
dotnet workload install maui --ignore-failed-sources
- name: Remove Mobile Project
if: ${{ !contains(runner.os, 'windows') }}
run: |
dotnet sln LLamaSharp.sln remove Llama.Mobile
- name: Cache Packages
uses: actions/cache@v4
with:
Expand Down
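The two new steps are complementary: exactly one of them runs on any given runner. A minimal shell sketch of the same decision follows. `mobile_action` is a hypothetical name, and the lower-casing mirrors the fact that the `contains` function in GitHub Actions expressions ignores case.

```shell
# mobile_action RUNNER_OS
# Prints which of the two workflow steps would run for the given runner OS.
mobile_action() {
  # Lower-case the input so "Windows" and "windows" are treated alike.
  os=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$os" in
    *windows*) echo "install-workloads" ;;      # dotnet workload install android / maui
    *)         echo "remove-mobile-project" ;;  # dotnet sln LLamaSharp.sln remove Llama.Mobile
  esac
}
```

For example, `mobile_action "$RUNNER_OS"` could be called inside a job to log which branch applies.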
1 change: 0 additions & 1 deletion .gitignore
Expand Up @@ -337,7 +337,6 @@ test/TensorFlowNET.Examples/mnist
# training model resources
.resources
/redist
*.xml
*.xsd

# docs
Expand Down
112 changes: 112 additions & 0 deletions LLama/LLamaSharp.Runtime.targets
Expand Up @@ -202,6 +202,28 @@
</None>


<None Include="$(MSBuildThisFileDirectory)runtimes/deps/linux-arm64/libllama.so">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
<Link>runtimes/linux-arm64/native/libllama.so</Link>
</None>
<None Include="$(MSBuildThisFileDirectory)runtimes/deps/linux-arm64/libggml.so">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
<Link>runtimes/linux-arm64/native/libggml.so</Link>
</None>
<None Include="$(MSBuildThisFileDirectory)runtimes/deps/linux-arm64/libggml-base.so">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
<Link>runtimes/linux-arm64/native/libggml-base.so</Link>
</None>
<None Include="$(MSBuildThisFileDirectory)runtimes/deps/linux-arm64/libggml-cpu.so">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
<Link>runtimes/linux-arm64/native/libggml-cpu.so</Link>
</None>
<None Include="$(MSBuildThisFileDirectory)runtimes/deps/linux-arm64/libllava_shared.so">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
<Link>runtimes/linux-arm64/native/libllava_shared.so</Link>
</None>


<None Include="$(MSBuildThisFileDirectory)runtimes/deps/cu11.7.1/libllama.so">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
<Link>runtimes/linux-x64/native/cuda11/libllama.so</Link>
Expand Down Expand Up @@ -466,4 +488,94 @@
<Link>runtimes/linux-x64/native/vulkan/libllava_shared.so</Link>
</None>
</ItemGroup>

<!-- Android Native Libs (Start) -->
<ItemGroup
Condition="$(AndroidSupportedAbis.Contains('x86')) or $(RuntimeIdentifiers.Contains('android-x86'))">
<AndroidNativeLibrary Visible="false"
Include="$(MSBuildThisFileDirectory)runtimes/deps/android-x86/libllama.so">
<Link>runtimes/android-x86/native/libllama.so</Link>
<Abi>x86</Abi>
</AndroidNativeLibrary>
<AndroidNativeLibrary Visible="false"
Include="$(MSBuildThisFileDirectory)runtimes/deps/android-x86/libggml.so">
<Link>runtimes/android-x86/native/libggml.so</Link>
<Abi>x86</Abi>
</AndroidNativeLibrary>
<AndroidNativeLibrary Visible="false"
Include="$(MSBuildThisFileDirectory)runtimes/deps/android-x86/libggml-base.so">
<Link>runtimes/android-x86/native/libggml-base.so</Link>
<Abi>x86</Abi>
</AndroidNativeLibrary>
<AndroidNativeLibrary Visible="false"
Include="$(MSBuildThisFileDirectory)runtimes/deps/android-x86/libggml-cpu.so">
<Link>runtimes/android-x86/native/libggml-cpu.so</Link>
<Abi>x86</Abi>
</AndroidNativeLibrary>
<AndroidNativeLibrary Visible="false"
Include="$(MSBuildThisFileDirectory)runtimes/deps/android-x86/libllava_shared.so">
<Link>runtimes/android-x86/native/libllava_shared.so</Link>
<Abi>x86</Abi>
</AndroidNativeLibrary>
</ItemGroup>

<ItemGroup
Condition="$(AndroidSupportedAbis.Contains('x86_64')) or $(RuntimeIdentifiers.Contains('android-x64'))">
<AndroidNativeLibrary Visible="false"
Include="$(MSBuildThisFileDirectory)runtimes/deps/android-x86_64/libllama.so">
<Link>lib/x86_64/libllama.so</Link>
<Abi>x86_64</Abi>
</AndroidNativeLibrary>
<AndroidNativeLibrary Visible="false"
Include="$(MSBuildThisFileDirectory)runtimes/deps/android-x86_64/libggml.so">
<Link>lib/x86_64/libggml.so</Link>
<Abi>x86_64</Abi>
</AndroidNativeLibrary>
<AndroidNativeLibrary Visible="false"
Include="$(MSBuildThisFileDirectory)runtimes/deps/android-x86_64/libggml-base.so">
<Link>lib/x86_64/libggml-base.so</Link>
<Abi>x86_64</Abi>
</AndroidNativeLibrary>
<AndroidNativeLibrary Visible="false"
Include="$(MSBuildThisFileDirectory)runtimes/deps/android-x86_64/libggml-cpu.so">
<Link>lib/x86_64/libggml-cpu.so</Link>
<Abi>x86_64</Abi>
</AndroidNativeLibrary>
<AndroidNativeLibrary Visible="false"
Include="$(MSBuildThisFileDirectory)runtimes/deps/android-x86_64/libllava_shared.so">
<Link>lib/x86_64/libllava_shared.so</Link>
<Abi>x86_64</Abi>
</AndroidNativeLibrary>
</ItemGroup>

<ItemGroup
Condition="$(AndroidSupportedAbis.Contains('arm64-v8a')) or $(RuntimeIdentifiers.Contains('android-arm64'))">
<AndroidNativeLibrary Visible="false"
Include="$(MSBuildThisFileDirectory)runtimes/deps/android-arm64-v8a/libllama.so">
<Link>lib/arm64-v8a/libllama.so</Link>
<Abi>arm64-v8a</Abi>
</AndroidNativeLibrary>
<AndroidNativeLibrary Visible="false"
Include="$(MSBuildThisFileDirectory)runtimes/deps/android-arm64-v8a/libggml.so">
<Link>lib/arm64-v8a/libggml.so</Link>
<Abi>arm64-v8a</Abi>
</AndroidNativeLibrary>
<AndroidNativeLibrary Visible="false"
Include="$(MSBuildThisFileDirectory)runtimes/deps/android-arm64-v8a/libggml-base.so">
<Link>lib/arm64-v8a/libggml-base.so</Link>
<Abi>arm64-v8a</Abi>
</AndroidNativeLibrary>
<AndroidNativeLibrary Visible="false"
Include="$(MSBuildThisFileDirectory)runtimes/deps/android-arm64-v8a/libggml-cpu.so">
<Link>lib/arm64-v8a/libggml-cpu.so</Link>
<Abi>arm64-v8a</Abi>
</AndroidNativeLibrary>
<AndroidNativeLibrary Visible="false"
Include="$(MSBuildThisFileDirectory)runtimes/deps/android-arm64-v8a/libllava_shared.so">
<Link>lib/arm64-v8a/libllava_shared.so</Link>
<Abi>arm64-v8a</Abi>
</AndroidNativeLibrary>
</ItemGroup>
<!-- Android Native Libs (End) -->

</Project>
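One detail of the Android conditions may be worth checking. `Contains('x86')` is a substring test, so an `AndroidSupportedAbis` value that lists only `x86_64` would presumably also satisfy the `x86` group's condition. The sketch below uses shell rather than MSBuild to contrast a substring test with an exact match against a `;`-separated list. `has_substring` and `has_abi` are hypothetical helper names, and the sample ABI list in the usage note is an assumption.

```shell
# has_substring HAYSTACK NEEDLE
# Succeeds if NEEDLE occurs anywhere in HAYSTACK (the behaviour of String.Contains).
has_substring() {
  case "$1" in
    *"$2"*) return 0 ;;
    *)      return 1 ;;
  esac
}

# has_abi LIST ABI
# Succeeds only if ABI is a whole entry of the ';'-separated LIST.
has_abi() {
  case ";$1;" in
    *";$2;"*) return 0 ;;
    *)        return 1 ;;
  esac
}
```

With the list `arm64-v8a;x86_64`, `has_substring` reports a match for `x86` while `has_abi` does not. Whether this matters in practice depends on how the Android build treats entries whose files are missing, which the diff does not show.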
2 changes: 1 addition & 1 deletion LLama/LLamaSharp.csproj
Expand Up @@ -57,7 +57,7 @@
</ItemGroup>

<PropertyGroup>
<BinaryReleaseId>be7c3034108473be</BinaryReleaseId>
<BinaryReleaseId>ceda28ef8e310a8de</BinaryReleaseId>
</PropertyGroup>

<PropertyGroup>
Expand Down
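The new `BinaryReleaseId` appears to be a prefix of the llama.cpp commit hash named in the commit messages (`ceda28ef8e310a8dee60bf275077a3eedae8e36c`). A minimal shell sketch of that check follows. `is_commit_prefix` is a hypothetical helper and not part of the build.

```shell
# is_commit_prefix SHORT_ID FULL_HASH
# Succeeds if FULL_HASH starts with SHORT_ID.
is_commit_prefix() {
  case "$2" in
    "$1"*) return 0 ;;
    *)     return 1 ;;
  esac
}

if is_commit_prefix ceda28ef8e310a8de ceda28ef8e310a8dee60bf275077a3eedae8e36c; then
  echo "BinaryReleaseId matches the referenced commit"
fi
```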
5 changes: 5 additions & 0 deletions LLama/Native/LLamaModelParams.cs
Expand Up @@ -13,6 +13,11 @@ public unsafe struct LLamaModelParams
/// todo: add support for llama_model_params.devices
/// </summary>
private IntPtr devices;

// NULL-terminated list of buffer types to use for tensors that match a pattern
// actual type: llama_model_tensor_buft_override*
// todo: add support for tensor_buft_overrides
private IntPtr tensor_buft_overrides;

/// <summary>
/// number of layers to store in VRAM
Expand Down
5 changes: 5 additions & 0 deletions LLama/Native/LLamaModelQuantizeParams.cs
Expand Up @@ -89,6 +89,11 @@ public bool keep_split
/// </summary>
public IntPtr kv_overrides;

/// <summary>
/// pointer to vector containing tensor types
/// </summary>
public IntPtr tensor_types;

/// <summary>
/// Create a LLamaModelQuantizeParams with default values
/// </summary>
Expand Down
5 changes: 5 additions & 0 deletions LLama/Native/LLamaVocabPreType.cs
Expand Up @@ -38,5 +38,10 @@ internal enum LLamaVocabPreType
MINERVA = 27,
DEEPSEEK3_LLM = 28,
GPT4O = 29,
SUPERBPE = 30,
TRILLION = 31,
BAILINGMOE = 32,
LLAMA4 = 33,
PIXTRAL = 34,
}
// ReSharper restore InconsistentNaming