cpu-inference

Star

Here are 26 public repositories matching this topic...

kennethleungty / Llama-2-Open-Source-LLM-CPU-Inference

Star

Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A

Updated Nov 6, 2023
Python

CoderLSF / fast-llama

Star

Runs LLaMA with Extremely HIGH speed

llama inference-engine cpu-inference llama2

Updated Nov 21, 2023
C++

rbitr / llm.f90

Star

LLM inference in Fortran

ai chatbot transformer llama language-model mamba state-space-model cpu-inference llm llamacpp llama2 phi-2

Updated May 30, 2024
Fortran

jozsefszalma / homelab

Star

The bare metal in my basement

machine-learning ai deep-learning server gpu unraid hobby-project gigabyte bare-metal homelab hpc-systems hardware-hacking supermicro cpu-inference cse-825 g292-z20

Updated Oct 20, 2025

lucienhuangfu / eLLM

Star

eLLM Infers LLM on CPUs in Real Time

llama cpu-inference deep-thinking llm-infernece deep-research context-engineering rust-llm

Updated Oct 17, 2025
Rust

yybit / pllm

Star

Portable LLM - A rust library for LLM inference

cpu-inference aigc llm llama2

Updated Apr 13, 2024
Rust

laelhalawani / gguf_llama

Star

Wrapper for simplified use of Llama2 GGUF quantized models.

llama quantization cpu-inference llamacpp llama2 gguf

Updated Jan 14, 2024
Python

JohnClaw / chatllm.v

Star

V-lang api wrapper for llm-inference chatllm.cpp

chatbot inference bindings api-wrapper llama quantization gemma mistral v-lang vlang cpu-inference llm llms chatllm ggml llm-inference qwen phi3

Updated Nov 20, 2024
C

lahcenkh / rag-network-docs

Star

Privacy-focused RAG chatbot for network documentation. Chat with your PDFs locally using Ollama, Chroma & LangChain. CPU-only, fully offline.

ai python3 network-programming cpu-inference vector-database-embedding rag-chatbot

Updated Sep 7, 2025
Python

codito / arey

Star

Simple large language model playground app

cli ai mistral cpu-inference large-language-models llm local-model llamacpp llama2 gguf

Updated Oct 26, 2025
Rust

JohnClaw / chatllm.vb

Star

VB.NET api wrapper for llm-inference chatllm.cpp

bindings api-wrapper llama vb-net vbnet gemma mistral int8 int8-inference int8-quantization cpu-inference chatllm ggml llm-inference qwen

Updated Nov 26, 2024
Visual Basic .NET

JohnClaw / chatllm.cs

Star

C# api wrapper for llm-inference chatllm.cpp

csharp inference bindings api-wrapper llama gemma mistral int8 int8-inference int8-quantization cpu-inference llm llms chatllm ggml llm-inference qwen

Updated Nov 20, 2024
C#

PlantAi is a ResNet-based CNN model trained on the PlantVillage dataset to classify plant leaf images as healthy or diseased. This repository includes PyTorch training code, tools to convert the model to TensorFlow Lite (TFLite) for deployment, and an Android app integrating the model for real-time leaf disease detection from camera images.

android java deep-neural-networks computer-vision deep-learning cnn image-classification resnet onnx pytoch cpu-inference tflight real-time-inference agriculture-ai

Updated Aug 21, 2025
Java

BjornMelin / local-llm-workbench

Star

🧠 A comprehensive toolkit for benchmarking, optimizing, and deploying local Large Language Models. Includes performance testing tools, optimized configurations for CPU/GPU/hybrid setups, and detailed guides to maximize LLM performance on your hardware.