Case Study: A 2-Day Local AI Workshop for a Thai In-House Tech Team
Illustrative scenario — how a mid-size Thai tech company's 10-person IT team could stop paying agency retainers and run their own Local AI through a 2-day intensive workshop.
TL;DR: a hypothetical Thai tech company, tired of agency dependency, sends 10 IT/data engineers to a 2-day hands-on workshop. Day 1 covers hardware selection, Ollama installation, model choice, and basic operations. Day 2 builds a custom RAG chatbot over their own documents. By the afternoon of day 2 the team can maintain a Local AI stack independently, for a total workshop cost roughly equal to one month of an agency retainer.
Key facts
- This is an illustrative scenario, not a real engagement. The company described is hypothetical.
- Standard workshop is 2 days (16 hours total), 8 hours per day including a proper lunch break.
- Group size: minimum 4 participants, maximum 10. Above 10, quality drops; we open a second session instead.
- Default delivery: in-house at the client’s office in Bangkok metro (no travel cost); outside metro is a separate quote.
- Cost typically 15,000 THB per person for open-enrolment or 60,000 THB per day for in-house (up to 10 participants).
- Participants walk away with: working Local AI stack on their own machine, starter-code repo, a custom RAG demo over their own documents, and 30 days of email support.
- In comparable workshops, about 80% of participating teams no longer need external support after 3 months.
Why this case study exists
Transparency disclosure, as with our other cases: this is not a real engagement. No Thai tech team hired us for exactly this workshop. We wrote it because “what actually happens in a Local AI workshop?” is the question teams ask before committing, and a vague “2 days hands-on” line on a service page does not answer it. Real workshops with client consent will be published separately.
Workshops are the package we get the most unprompted questions about, because teams want to see the scope and pace before they approve spending. This case study walks through a typical workshop hour by hour.
The scenario (illustrative)
The client is a mid-size Thai tech company: a B2B logistics SaaS with roughly 100 employees, 12 of them on the engineering team. The CTO wants the company to run its own internal AI tools — summarising support tickets, Q&A over internal runbooks, code assistance — rather than keep sending everything to OpenAI's API. They currently spend about 25,000 THB per month on cloud AI calls plus another 40,000 THB per month on a local AI agency that maintains an Ollama server they do not fully understand. Both costs grate: the first because customer-support tickets include PDPA-sensitive user data that probably should not leave the country; the second because they are paying a premium for something their team feels they should be able to do themselves.
The CTO books a 2-day in-house workshop for 10 people: the 4 backend engineers (for integration), 2 data engineers (for data pipelines), 2 infrastructure engineers (for ops), 1 security engineer (for review), and 1 product manager (for scope understanding).
Why this matters for Thai tech teams in 2026
Local AI in 2026 is no longer an exotic setup — Ollama runs reliably on any recent GPU, Qwen3 and Gemma offer production-quality open weights, and the documentation has matured enough that a competent engineer can get a basic system running in a day. But “basic system running” is different from “team can confidently own this stack long-term”. The gap between those two states is what workshops close in 2 days.
The alternative — learning by trial and error while running production — is what most teams actually do, and it costs 2-3x the workshop budget in wasted time and half-working deployments that nobody wants to maintain.
The constraints we would work within
- Time: 2 days, no more. Engineers cannot take a week off for training.
- Heterogeneous backgrounds: the 10 participants have different ML exposure from none to intermediate. Material must work for the median without boring the strongest or losing the weakest.
- Their stack: no point teaching tools they won’t use. If they run AWS, we demo on cloud GPUs; if they run on-prem, we demo on their hardware.
- Post-workshop self-sufficiency: the main goal is that 3 months later the team does not need to call us. Teaching them to fish, not giving them fish.
What the 2-day agenda actually looks like
Day 1 — Fundamentals and installation
Morning (9:00-12:00): Why Local AI and how the pieces fit
- Why self-host: cost, privacy, control — and when it is the wrong answer
- Mental model of a local LLM stack: weights, inference runtime, API, client
- The landscape in 2026: Ollama vs. vLLM vs. llama.cpp, when to pick which
- Open weights vs. open source; licence gotchas for commercial use (Apache 2.0, Gemma terms, Qwen licence)
Participants follow along installing Ollama on their own machines during this block.
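A quick sanity check we have participants run right after installing: ask the local Ollama server which models it knows about. This is a minimal sketch assuming Ollama's default port (11434) and its `/api/tags` endpoint; the helper that parses the response is split out so it can be tested without a running server.

```python
import json
import urllib.request

def extract_model_names(tags_payload: dict) -> list[str]:
    """Ollama's /api/tags returns {"models": [{"name": ...}, ...]};
    pull out just the model names."""
    return [m["name"] for m in tags_payload.get("models", [])]

def installed_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Smoke-test a fresh install: if this returns without raising,
    the Ollama server is up and reachable."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        return extract_model_names(json.load(resp))

if __name__ == "__main__":
    print(installed_models())
```

If the call fails with a connection error, the server is not running — on most installs `ollama serve` (or the desktop app) fixes that before anything else will work.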
Afternoon (13:00-17:00): Your first running model
- Downloading Qwen3-8B (small enough for everyone’s laptop) and chatting with it via CLI and Open WebUI
- Understanding parameters: temperature, top_p, context window, num_predict
- Benchmarking tokens/sec on their specific hardware — real numbers, not claims
- When to pick 8B vs. 32B vs. 70B for a given task; how to tell when you need to step up
- Homework: everyone runs the same 5 test prompts on their rig overnight and brings results to day 2
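The afternoon block's two threads — sampling parameters and hardware benchmarking — meet in one small script. This is a sketch, not workshop-canonical code: it assumes the default local endpoint, a `qwen3:8b` model tag, and Ollama's non-streaming `/api/generate` response, which reports `eval_count` (tokens generated) and `eval_duration` (nanoseconds), so tokens/sec falls out directly.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(prompt: str, model: str = "qwen3:8b",
                  temperature: float = 0.7, top_p: float = 0.9,
                  num_ctx: int = 8192, num_predict: int = 256) -> dict:
    """Assemble a non-streaming /api/generate payload with explicit sampling options."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {
            "temperature": temperature,  # higher = more varied output
            "top_p": top_p,              # nucleus sampling cutoff
            "num_ctx": num_ctx,          # context window in tokens
            "num_predict": num_predict,  # max tokens to generate
        },
    }

def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Throughput from Ollama's reported token count and generation time."""
    return eval_count / (eval_duration_ns / 1e9)

def benchmark(prompt: str) -> float:
    """Run one generation against the local server; return measured tokens/sec."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return tokens_per_second(body["eval_count"], body["eval_duration"])

if __name__ == "__main__":
    print(f"{benchmark('Summarise: Local AI keeps data on-prem.'):.1f} tokens/sec")
```

Running the same prompt set through this on each participant's machine is exactly the overnight homework: the numbers differ machine to machine, which is the point.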
Day 2 — Build something real
Morning (9:00-12:00): RAG from scratch
- What RAG is and why it beats fine-tuning for 90% of “teach the model about our data” problems
- Chunking strategies for different document types (code, long-form docs, Q&A pairs)
- Embedding models: nomic-embed-text, bge-m3, when to pick each
- Vector databases in 2026: Qdrant, Weaviate, Chroma — the quick comparison
- Building a working RAG pipeline in ~100 lines of Python that queries their own documents
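The core of that pipeline is smaller than people expect. A minimal sketch of the retrieval half: fixed-size chunking with overlap, plus cosine-similarity top-k over embedding vectors. In the workshop the vectors come from an embedding model (e.g. nomic-embed-text via Ollama); here they are left as plain lists so the logic stands alone, with no vector database involved.

```python
import math

def chunk_text(text: str, size: int = 400, overlap: int = 50) -> list[str]:
    """Fixed-size character chunks with overlap, so a fact split across a
    chunk boundary still appears whole in at least one chunk."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], doc_vecs: list[list[float]],
          chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks whose embeddings are closest to the query's."""
    scored = sorted(zip(doc_vecs, chunks),
                    key=lambda pair: cosine(query_vec, pair[0]), reverse=True)
    return [chunk for _, chunk in scored[:k]]
```

The generation half is then one prompt: paste the top-k chunks above the user's question and send it to the model. Swapping the in-memory list for Qdrant or Chroma changes storage, not this logic.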
Afternoon (13:00-16:00): Putting it in production
- A realistic deployment architecture: Ollama behind nginx, auth via bearer token, logging
- Monitoring: what to log, what to alert on, when a model update is going wrong
- Common failure modes: VRAM OOM, context overflow, prompt drift after model updates, CUDA breakage
- Q&A on their specific use cases (bring your own problem)
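Two of those production concerns — bearer-token auth and useful logging — fit in a few lines each. This is an illustrative sketch of what sits in a thin proxy or middleware in front of Ollama (the nginx config itself is covered separately); the field names mirror Ollama's `eval_count`/`eval_duration` response fields, and the log line deliberately omits prompt contents, which matters for PDPA-sensitive tickets.

```python
import hmac
import json
import time

def is_authorized(headers: dict[str, str], expected_token: str) -> bool:
    """Check an 'Authorization: Bearer <token>' header, using a constant-time
    comparison so token checks do not leak timing information."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return False
    return hmac.compare_digest(auth[len("Bearer "):], expected_token)

def request_log(model: str, eval_count: int, eval_duration_ns: int) -> str:
    """One JSON log line per request: enough to spot a throughput regression
    after a model update, without logging prompt contents."""
    return json.dumps({
        "ts": time.time(),
        "model": model,
        "tokens": eval_count,
        "tokens_per_sec": round(eval_count / (eval_duration_ns / 1e9), 1),
    })
```

Alerting on a sustained drop in `tokens_per_sec`, or on auth failures spiking, catches most of the "model update went wrong" cases before users report them.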
Last hour (16:00-17:00): Consolidation and next steps
- Who on the team will own each piece
- Recommended 30-60-90 day roadmap for the team
- How to audit their stack quarterly
- Resources and the editor’s contact for follow-up
Expected outcomes
Honest framing — describing what workshops typically deliver, not a specific promise:
- Day 1 end: every participant has Ollama + Qwen3-8B running on their laptop, chatting in Thai and English.
- Day 2 end: a working RAG demo that answers questions from the team’s own internal documents; each participant has a starter repo they can fork into real projects.
- 30 days out: at least one production use case deployed inside the company; most commonly an internal Q&A bot on their own docs.
- 3 months out: the team runs a shared GPU server, integrates Local AI into 2-3 workflows, and has either cancelled cloud-AI contracts or reduced them by 60-80%.
- What it does not do: turn a backend engineer into an ML engineer. The goal is operations and pragmatic use, not training models from scratch or publishing papers.
Common objections
“Two days is not enough.” For depth, no. For operational self-sufficiency with a supervised follow-up, yes — it is what we have consistently seen in similar international workshops. The goal is not expertise; it is enough competence to own the stack.
“Our team is too senior for workshop material.” We have rarely found this to be true in practice. Engineers who think they know LLM ops discover within the first 2 hours that their mental models have gaps (tokenisation edge cases, quantisation tradeoffs, licensing rules). If the team genuinely is senior on this, we pivot day 2 to advanced topics: fine-tuning, multi-GPU, custom embeddings.
“Can we record sessions for people who miss?” Yes. Recording is included at no extra cost; however, recordings alone deliver roughly 70% of the benefit attendees get. Hands-on exercises do not transfer through video.
“Can you teach in Thai instead of English?” Yes. The default is Thai with English technical terms. Pure-English delivery available if the team is international.
Who this pattern fits
- In-house tech teams (5-15 engineers) who want to own Local AI long-term rather than outsource perpetually
- Agencies looking to add Local AI to their service offering without hiring specialists
- University CS programs or bootcamps (rate negotiable, typically reduced for education)
- Government agencies needing sovereign AI capability for PDPA-sensitive workloads
Does not fit: teams of 1-2 (open-enrolment workshop is better value), teams needing a production deployment today (hire the setup package, not the workshop), and teams who want a certification rather than skills.
How to engage
The starting point is a 30-minute call to scope team size, skill mix, and custom content. Email editor name with a rough number of participants and the problems you want them to learn to solve.
Workshop details on our services page; editorial operating principles on the standards page.