ohm314/slm_embedding_demo

Proof-of-Concept demo for using SLMs on CPU-only compute

Swiss law search engine

[Figure: Screenshot of the Swiss law search engine demo's Streamlit interface]

This is a simple example showing how smaller LLMs (small language models, or SLMs) can be a viable alternative to their larger and much more resource-hungry siblings when compute resources are constrained.

Here I present a simple but functional search engine for the (almost) complete legal texts of the Swiss Confederation in German, French, Italian and English. It can be hosted on CPU-only cloud compute, for instance an Oracle Cloud Infrastructure (OCI) VM.Standard.E6.Flex shape with 8 OCPUs and 32 GB of memory. No GPUs required!

The backend uses ChromaDB as its vector database. When the server starts, the entire legal corpus is read from XML files and parsed; vector embeddings are then created with the Qwen-3-Embedding-4B model and stored in the database.
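The startup flow described above (parse XML, embed each text, store the vectors) can be sketched with the standard library alone. This is a minimal illustration, not the repository's code: the `<article>` schema, the `toy_embed` function and the in-memory list are hypothetical stand-ins for the real corpus format, the embedding model and ChromaDB respectively.

```python
import hashlib
import math
import xml.etree.ElementTree as ET

DIM = 64  # toy embedding size; a real embedding model produces much larger vectors


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for the embedding model: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_index(xml_text: str) -> list[dict]:
    """Parse <article> elements (assumed schema) and keep id, text and embedding."""
    root = ET.fromstring(xml_text)
    index = []
    for art in root.iter("article"):
        # Collapse whitespace in the element's text content.
        text = " ".join("".join(art.itertext()).split())
        index.append({"id": art.get("id"), "text": text, "embedding": toy_embed(text)})
    return index


# Made-up sample document; the real corpus lives in XML files on disk.
SAMPLE = """<law>
  <article id="art-1">Every person has the right to life.</article>
  <article id="art-2">The confederation levies a value added tax.</article>
</law>"""

index = build_index(SAMPLE)
```

In the actual demo, the list of dictionaries would be replaced by a vector database collection and `toy_embed` by a call to the embedding model.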

[Figure: Overview of the ChromaDB-based search engine]

The Streamlit frontend presents a simple web interface with a search bar. The query text is embedded with the same model, a similarity search is run against the database, and the top 10 hits are returned.
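To make the query step concrete, here is a small sketch of a top-k similarity search over stored vectors. It assumes cosine similarity as the metric, which the description above does not specify, and uses a plain dictionary with made-up three-dimensional vectors in place of the vector database; `top_k` and the document ids are hypothetical names.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query_vec, index, k=10):
    """Return the k (score, doc_id) pairs most similar to query_vec, best first."""
    scored = ((cosine(query_vec, vec), doc_id) for doc_id, vec in index.items())
    return heapq.nlargest(k, scored)


# Made-up embeddings keyed by document id.
index = {
    "art-1": [1.0, 0.0, 0.0],
    "art-2": [0.0, 1.0, 0.0],
    "art-3": [0.7, 0.7, 0.0],
}

# The query vector would come from embedding the search text with the same model.
hits = top_k([1.0, 0.1, 0.0], index, k=2)
for score, doc_id in hits:
    print(f"{doc_id}: {score:.3f}")
```

A brute-force scan like this is linear in the number of documents; a vector database would typically use an approximate nearest-neighbour index instead.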

Setup instructions

  • Install the system dependencies. On Ubuntu this should suffice:
sudo apt install git tmux htop \
    build-essential gcc-12 g++-12 cmake libcurl4-openssl-dev
  • Install uv for easy deployment:
curl -LsSf https://astral.sh/uv/install.sh | sh
  • Get the code:
git clone https://github.com/ohm314/slm_embedding_demo.git
cd slm_embedding_demo
  • Run the Streamlit app:
uv run streamlit run src/local_server.py -- data/xml -p -q

To access the app, either deploy your VM in a public subnet with firewall rules that open the Streamlit port (8501 unless configured otherwise), or use SSH port forwarding to expose that port on your local machine, for example ssh -L 8501:localhost:8501 <user>@<vm-address>.

License

Copyright (c) 2025 Omar Awile. Licensed under the Universal Permissive License v 1.0 as shown at https://oss.oracle.com/licenses/upl/
