Replies: 2 comments
-
Hi there, before any fine-tuning I would first consider RAG (without and then with quantization), because the answers are then strictly based on the documents you provide. After RAG, I'd continue pretraining and/or fine-tune and see whether it makes things better or worse.
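Just to illustrate the RAG-first idea, here's a minimal sketch of a fully local pipeline. This is one possible stack (llama-cpp-python plus sentence-transformers), and the model filename, chunks, and paths are placeholders, not anything from the original post:

```python
# Minimal local RAG sketch: retrieve relevant course chunks, then answer with a small local model.
# Assumes llama-cpp-python and sentence-transformers are installed; file paths are placeholders.
import numpy as np
from sentence_transformers import SentenceTransformer
from llama_cpp import Llama

# Pre-chunked course text (in practice, split the textbook into ~200-500 token passages).
course_chunks = [
    "Photosynthesis converts light energy into chemical energy stored in glucose.",
    "Mitochondria are the site of cellular respiration.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small embedding model, runs on CPU
chunk_vecs = embedder.encode(course_chunks, normalize_embeddings=True)

# 4-bit quantized GGUF model (placeholder filename).
llm = Llama(model_path="llama-3-3b-q4_k_m.gguf", n_ctx=2048)

def answer(question: str, top_k: int = 2) -> str:
    # Cosine similarity reduces to a dot product on normalized vectors.
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    scores = chunk_vecs @ q_vec
    context = "\n".join(course_chunks[i] for i in np.argsort(scores)[::-1][:top_k])
    prompt = (
        "Answer using only the context below. If the answer is not in the context, say so.\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm(prompt, max_tokens=256)["choices"][0]["text"].strip()

print(answer("Where does cellular respiration happen?"))
```

The same retrieval step works whether or not you later fine-tune: you can compare answer quality with and without the retrieved context before investing in training.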
-
Yes, it's definitely possible to run an offline learning app on mid-range Android phones (3-8 GB RAM). The trick is to use a small, efficient model: models in the 1-3B range, quantized to 4-bit (like Llama-3-3B), fit comfortably in that memory budget. The smart setup is: ship one base model once, then add tiny LoRA adapters (course packs, just a few MB each). Use a local retrieval system (RAG) so the model always refers to the actual textbook instead of guessing. Run it all with llama.cpp.
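As a rough sketch of what "one base model + per-course LoRA adapters" could look like, using llama-cpp-python as a stand-in for a native llama.cpp integration (on Android you'd more likely call llama.cpp through JNI/NDK). The model and adapter filenames below are placeholders:

```python
# Sketch: one shared 4-bit base model plus a per-course LoRA adapter, loaded fully offline.
# Paths are placeholders; a real Android app would bundle these files in its local storage.
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-3-3b-q4_k_m.gguf",  # ~2 GB base model, shipped once
    lora_path="packs/grade9_physics_lora.gguf",  # few-MB adapter downloaded per course pack
    n_ctx=2048,    # modest context window to keep RAM usage low on 3-8 GB phones
    n_threads=4,   # match the phone's performance cores
)

out = llm(
    "Explain Newton's second law with an example from the Grade 9 course.",
    max_tokens=200,
)
print(out["choices"][0]["text"])
```

Swapping course packs then just means pointing `lora_path` at a different adapter file, while the large base model stays on disk unchanged.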
-
Hello,
I am exploring the development of an offline educational mobile app for students in areas where internet access is unreliable or unavailable.
The app would allow students (Grade 6 to university) to download the course material for a single school year.
Each pack would include a small LLM model (or adapter) that runs fully offline on mid-range Android smartphones.
Once downloaded, the app should work 100% offline (no cloud access required), with good performance and minimal latency.
I want the LLM to be able to answer questions based on the course material and help students solve exercises, with minimal hallucination.
My questions:
Is this technically feasible on the mid-range smartphones typical in these countries (3-8 GB RAM, ~128-256 GB storage)?
Which model architecture strategy (quantization, LoRA adapters, small fine-tuned model, etc.) would you recommend for this use case?
Thanks.