Case Study: A 2-Day Local AI Workshop for a Thai In-House Tech Team
Illustrative scenario — how a mid-size Thai tech company's 10-person IT team could stop paying agency retainers and run their own Local AI through a 2-day intensive workshop.
TL;DR: a hypothetical Thai tech company, tired of agency dependency, sends 10 IT/data engineers to a 2-day hands-on workshop. Day 1 covers hardware selection, Ollama installation, model choice, and basic operations. Day 2 builds a custom RAG chatbot over their own documents. By the afternoon of day 2 the team can maintain a Local AI stack independently, for a total workshop cost roughly equal to one month of an agency retainer.
Key facts
- This is an illustrative scenario, not a real engagement. The company described is hypothetical.
- Standard workshop is 2 days (16 hours total), 8 hours per day including a proper lunch break.
- Group size: minimum 4 participants, maximum 10. Above 10, quality drops; we open a second session instead.
- Default delivery: in-house at the client’s office in Bangkok metro (no travel cost); outside metro is a separate quote.
- Cost typically 15,000 THB per person for open-enrolment or 60,000 THB per day for in-house (up to 10 participants).
- Participants walk away with: working Local AI stack on their own machine, starter-code repo, a custom RAG demo over their own documents, and 30 days of email support.
- In comparable workshops, about 80% of participating teams no longer need external support after 3 months.
Why this case study exists
Transparency disclosure, as with our other cases: this is not a real engagement. No Thai tech team hired us for exactly this workshop. We wrote it because “what actually happens in a Local AI workshop?” is the question teams ask before committing, and a vague “2 days hands-on” line on a service page does not answer it. Real workshops with client consent will be published separately.
Workshops are the package we get the most unprompted questions about, because teams want to see the scope and pace before they approve spending. This case study walks through a typical workshop hour by hour.
The scenario (illustrative)
The client is a mid-size Thai tech company: a B2B logistics SaaS with roughly 100 employees, 12 of them on the engineering team. The CTO wants the company to run its own internal AI tools — summarising support tickets, Q&A over internal runbooks, code assistance — rather than keep sending everything to OpenAI's API. They currently spend about 25,000 THB per month on cloud AI calls plus another 40,000 THB per month on a local AI agency that maintains an Ollama server they do not fully understand. Both costs grate: the first because customer-support tickets include PDPA-sensitive user data that probably should not leave the country; the second because they are paying a premium for something their team feels they should be able to do themselves.
The CTO books a 2-day in-house workshop for 10 people: the 4 backend engineers (for integration), 2 data engineers (for data pipelines), 2 infrastructure engineers (for ops), 1 security engineer (for review), and 1 product manager (for scope understanding).
Why this matters for Thai tech teams in 2026
Local AI in 2026 is no longer an exotic setup — Ollama runs reliably on any recent GPU, Qwen3 and Gemma offer production-quality open weights, and the documentation has matured enough that a competent engineer can get a basic system running in a day. But “basic system running” is different from “team can confidently own this stack long-term”. The gap between those two states is what workshops close in 2 days.
The alternative — learning by trial and error while running production — is what most teams actually do, and it costs 2-3x the workshop budget in wasted time and half-working deployments that nobody wants to maintain.
The constraints we would work within
- Time: 2 days, no more. Engineers cannot take a week off for training.
- Heterogeneous backgrounds: the 10 participants have different ML exposure from none to intermediate. Material must work for the median without boring the strongest or losing the weakest.
- Their stack: no point teaching tools they won’t use. If they run AWS, we demo on cloud GPUs; if they run on-prem, we demo on their hardware.
- Post-workshop self-sufficiency: the main goal is that 3 months later the team does not need to call us. Teaching them to fish, not giving them fish.
What the 2-day agenda actually looks like
Day 1 — Fundamentals and installation
Morning (9:00-12:00): Why Local AI and how the pieces fit
- Why self-host: cost, privacy, control — and when it is the wrong answer
- Mental model of a local LLM stack: weights, inference runtime, API, client
- The landscape in 2026: Ollama vs. vLLM vs. llama.cpp, when to pick which
- Open weights vs. open source; licence gotchas for commercial use (Apache 2.0, Gemma terms, Qwen licence)
Participants follow along installing Ollama on their own machines during this block.
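A quick sanity check we have participants run right after installing: ask the local Ollama server which models it knows about. This is a minimal sketch assuming Ollama's default port (11434) and its `/api/tags` endpoint; the helper that parses the response is split out so it can be tested without a running server.

```python
import json
import urllib.request

def extract_model_names(tags_payload: dict) -> list[str]:
    """Ollama's /api/tags returns {"models": [{"name": ...}, ...]};
    pull out just the model names."""
    return [m["name"] for m in tags_payload.get("models", [])]

def installed_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Smoke-test a fresh install: if this returns without raising,
    the Ollama server is up and reachable."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        return extract_model_names(json.load(resp))

if __name__ == "__main__":
    print(installed_models())
```

If the call fails with a connection error, the server is not running — on most installs `ollama serve` (or the desktop app) fixes that before anything else will work.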
Afternoon (13:00-17:00): Your first running model
- Downloading Qwen3-8B (small enough for everyone’s laptop) and chatting with it via CLI and Open WebUI
- Understanding parameters: temperature, top_p, context window, num_predict
- Benchmarking tokens/sec on their specific hardware — real numbers, not claims
- When to pick 8B vs. 32B vs. 70B for a given task; how to tell when you need to step up
- Homework: everyone runs the same 5 test prompts on their rig overnight and brings results to day 2
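The afternoon block's two threads — sampling parameters and hardware benchmarking — meet in one small script. This is a sketch, not workshop-canonical code: it assumes the default local endpoint, a `qwen3:8b` model tag, and Ollama's non-streaming `/api/generate` response, which reports `eval_count` (tokens generated) and `eval_duration` (nanoseconds), so tokens/sec falls out directly.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(prompt: str, model: str = "qwen3:8b",
                  temperature: float = 0.7, top_p: float = 0.9,
                  num_ctx: int = 8192, num_predict: int = 256) -> dict:
    """Assemble a non-streaming /api/generate payload with explicit sampling options."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {
            "temperature": temperature,  # higher = more varied output
            "top_p": top_p,              # nucleus sampling cutoff
            "num_ctx": num_ctx,          # context window in tokens
            "num_predict": num_predict,  # max tokens to generate
        },
    }

def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Throughput from Ollama's reported token count and generation time."""
    return eval_count / (eval_duration_ns / 1e9)

def benchmark(prompt: str) -> float:
    """Run one generation against the local server; return measured tokens/sec."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return tokens_per_second(body["eval_count"], body["eval_duration"])

if __name__ == "__main__":
    print(f"{benchmark('Summarise: Local AI keeps data on-prem.'):.1f} tokens/sec")
```

Running the same prompt set through this on each participant's machine is exactly the overnight homework: the numbers differ machine to machine, which is the point.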
Day 2 — Build something real
Morning (9:00-12:00): RAG from scratch
- What RAG is and why it beats fine-tuning for 90% of “teach the model about our data” problems
- Chunking strategies for different document types (code, long-form docs, Q&A pairs)
- Embedding models: nomic-embed-text, bge-m3, when to pick each
- Vector databases in 2026: Qdrant, Weaviate, Chroma — the quick comparison
- Building a working RAG pipeline in ~100 lines of Python that queries their own documents
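The core of that pipeline is smaller than people expect. A minimal sketch of the retrieval half: fixed-size chunking with overlap, plus cosine-similarity top-k over embedding vectors. In the workshop the vectors come from an embedding model (e.g. nomic-embed-text via Ollama); here they are left as plain lists so the logic stands alone, with no vector database involved.

```python
import math

def chunk_text(text: str, size: int = 400, overlap: int = 50) -> list[str]:
    """Fixed-size character chunks with overlap, so a fact split across a
    chunk boundary still appears whole in at least one chunk."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], doc_vecs: list[list[float]],
          chunks: list[str], k: int = 3) -> list[str]:
    """Return the k chunks whose embeddings are closest to the query's."""
    scored = sorted(zip(doc_vecs, chunks),
                    key=lambda pair: cosine(query_vec, pair[0]), reverse=True)
    return [chunk for _, chunk in scored[:k]]
```

The generation half is then one prompt: paste the top-k chunks above the user's question and send it to the model. Swapping the in-memory list for Qdrant or Chroma changes storage, not this logic.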
Afternoon (13:00-16:00): Putting it in production
- A realistic deployment architecture: Ollama behind nginx, auth via bearer token, logging
- Monitoring: what to log, what to alert on, when a model update is going wrong
- Common failure modes: VRAM OOM, context overflow, prompt drift after model updates, CUDA breakage
- Q&A on their specific use cases (bring your own problem)
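Two of those production concerns — bearer-token auth and useful logging — fit in a few lines each. This is an illustrative sketch of what sits in a thin proxy or middleware in front of Ollama (the nginx config itself is covered separately); the field names mirror Ollama's `eval_count`/`eval_duration` response fields, and the log line deliberately omits prompt contents, which matters for PDPA-sensitive tickets.

```python
import hmac
import json
import time

def is_authorized(headers: dict[str, str], expected_token: str) -> bool:
    """Check an 'Authorization: Bearer <token>' header, using a constant-time
    comparison so token checks do not leak timing information."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return False
    return hmac.compare_digest(auth[len("Bearer "):], expected_token)

def request_log(model: str, eval_count: int, eval_duration_ns: int) -> str:
    """One JSON log line per request: enough to spot a throughput regression
    after a model update, without logging prompt contents."""
    return json.dumps({
        "ts": time.time(),
        "model": model,
        "tokens": eval_count,
        "tokens_per_sec": round(eval_count / (eval_duration_ns / 1e9), 1),
    })
```

Alerting on a sustained drop in `tokens_per_sec`, or on auth failures spiking, catches most of the "model update went wrong" cases before users report them.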
Last hour (16:00-17:00): Consolidation and next steps
- Who on the team will own each piece
- Recommended 30-60-90 day roadmap for the team
- How to audit their stack quarterly
- Resources and the editor’s contact for follow-up
Expected outcomes
Honest framing — describing what workshops typically deliver, not a specific promise:
- Day 1 end: every participant has Ollama + Qwen3-8B running on their laptop, chatting in Thai and English.
- Day 2 end: a working RAG demo that answers questions from the team’s own internal documents; each participant has a starter repo they can fork into real projects.
- 30 days out: at least one production use case deployed inside the company; most commonly an internal Q&A bot on their own docs.
- 3 months out: the team runs a shared GPU server, integrates Local AI into 2-3 workflows, and has either cancelled cloud-AI contracts or reduced them by 60-80%.
- What it does not do: turn a backend engineer into an ML engineer. The goal is operations and pragmatic use, not training models from scratch or publishing papers.
Common objections
“Two days is not enough.” For depth, no. For operational self-sufficiency with a supervised follow-up, yes — it is what we have consistently seen in similar international workshops. The goal is not expertise; it is enough competence to own the stack.
“Our team is too senior for workshop material.” We have rarely found this to be true in practice. Engineers who think they know LLM ops discover within the first 2 hours that their mental models have gaps (tokenisation edge cases, quantisation tradeoffs, licensing rules). If the team genuinely is senior on this, we pivot day 2 to advanced topics: fine-tuning, multi-GPU, custom embeddings.
“Can we record sessions for people who miss?” Yes. Recording is included at no extra cost; however, recordings alone deliver roughly 70% of the benefit attendees get. Hands-on exercises do not transfer through video.
“Can you teach in Thai instead of English?” Yes. The default is Thai with English technical terms. Pure-English delivery available if the team is international.
Who this pattern fits
- In-house tech teams (5-15 engineers) who want to own Local AI long-term rather than outsource perpetually
- Agencies looking to add Local AI to their service offering without hiring specialists
- University CS programs or bootcamps (rate negotiable, typically reduced for education)
- Government agencies needing sovereign AI capability for PDPA-sensitive workloads
Does not fit: teams of 1-2 (open-enrolment workshop is better value), teams needing a production deployment today (hire the setup package, not the workshop), and teams who want a certification rather than skills.
How to engage
The starting point is a 30-minute call to scope team size, skill mix, and custom content. Email editor name with a rough number of participants and the problems you want them to learn to solve.
Workshop details on our services page; editorial operating principles on the standards page.