Dapo Vokasi
20 million rows of scattered teaching records. 7 institutions that had never shared data. One platform to make sense of it all.
The Problem
Indonesia has 14,000+ vocational high schools (SMK). Their vocational-subject teachers, known as guru produktif, deliver the hands-on, industry-relevant skills that millions of students depend on. Training these teachers is the job of the Directorate General of Vocational Education (Direktorat Jenderal Pendidikan Vokasi) under Kemendikdasmen, the Ministry of Primary and Secondary Education.
The Directorate operates 7 specialized training centers (BBPPMPV/BPPMPV) spread across the country: Automotive & Electronics in Malang, Machinery & Industrial Engineering in Bandung, Maritime & ICT in Gowa, and more. Each runs upskilling and reskilling programs for SMK teachers nationwide.
The problem? Nobody could answer even basic questions:
- How many guru produktif are trained vs. untrained?
- Which teachers get invited to training over and over, while others never get a chance?
- Which schools hoard training access, and which are left behind?
- How is each training center actually performing?
The reason was structural. Each balai maintained its own data, in its own format, in its own systems. No shared standard. Not even a consistent way to define what a teacher's core competency is.
The Data Chaos
We started with meetings. The challenges started there too.
The data was scattered: different formats, different storage, different definitions across all 7 balai. We didn't even have a reliable way to determine a teacher's primary specialization. A teacher officially registered as a Pendidikan Agama (religious education) teacher might appear as a TKJ (computer networking) teacher in training records, simply because they taught a few hours a week in that subject at their school.
Multiply that ambiguity across the entire national teaching workforce and you get 20 million rows of teaching history with no reliable way to answer even basic questions.
We spent 2 weeks purely on data analysis, building a custom ETL pipeline in SQL Server with purpose-built algorithms to resolve identity and competency conflicts. The result: 20 million rows distilled into 300,000+ clean teacher records, each mapped to the teacher's verified primary competency. That Pendidikan Agama teacher stays classified as a Pendidikan Agama teacher, not misidentified as TKJ.
[Figure: raw teaching records reduced to clean teacher records. Pipeline stages: scattered data from 7 balai, identity & competency resolution, verified teacher profiles.]
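To make the competency-resolution idea concrete, here is a minimal sketch in Python. It is not the production SQL Server pipeline, and the rule it applies is an assumption for illustration: the official registration wins when it exists, and otherwise the subject with the most weekly teaching hours is used. The names `TeachingRow`, `registered_subject`, and `resolve_primary_competency` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class TeachingRow:
    """One row of raw teaching history (field names are hypothetical)."""
    teacher_id: str
    subject: str                      # subject taught in this row, e.g. "TKJ"
    hours_per_week: int
    registered_subject: Optional[str] = None  # official registration, if known


def resolve_primary_competency(rows: Iterable[TeachingRow]) -> Dict[str, str]:
    """Collapse many teaching rows into one primary competency per teacher.

    Assumed rule: official registration first, most-taught subject second.
    """
    hours: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    registered: Dict[str, str] = {}

    for row in rows:
        hours[row.teacher_id][row.subject] += row.hours_per_week
        if row.registered_subject:
            registered[row.teacher_id] = row.registered_subject

    resolved: Dict[str, str] = {}
    for teacher_id, by_subject in hours.items():
        if teacher_id in registered:
            # A few hours of TKJ must not override the official registration.
            resolved[teacher_id] = registered[teacher_id]
        else:
            # Sorting first makes ties resolve alphabetically (deterministic).
            resolved[teacher_id] = max(sorted(by_subject), key=by_subject.__getitem__)
    return resolved
```

With this rule, a teacher registered for Pendidikan Agama who also teaches a few hours of TKJ resolves to Pendidikan Agama, which matches the outcome described above.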
The Platform
With clean data in hand, the next problem was structural: how do you unify 7 independent institutions onto a single platform without breaking their autonomy?
And here's where the real challenge lives. Not in the code, but in the bureaucracy. Creating a new data field in a government system is easy. Convincing stakeholders across 7 institutions why that field matters is where the work happens.
We built a multi-tenant platform where each balai operates as its own tenant with visibility into the national teacher dataset. The stack (Laravel, Inertia.js, React.js, SQL Server) was chosen deliberately for government infrastructure compatibility and long-term maintainability within their existing IT ecosystem.
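The tenant model can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Laravel implementation: the class `TenantScope`, the record fields, and the tenant ids are all hypothetical. It shows the two properties described above: each balai reads and writes only its own training records, while every balai can read the shared national teacher dataset.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Dict, List, Mapping


@dataclass(frozen=True)
class TrainingRecord:
    balai_id: str      # owning tenant (hypothetical id, e.g. "malang")
    teacher_id: str
    program: str


class TenantScope:
    """A balai's view of the platform: isolated records, shared teachers."""

    def __init__(self, balai_id: str, records: List[TrainingRecord],
                 teachers: Dict[str, str]) -> None:
        self._balai_id = balai_id
        self._records = records  # shared storage, filtered on every read
        # Read-only view of the national dataset (teacher_id -> competency).
        self.teachers: Mapping[str, str] = MappingProxyType(teachers)

    def own_records(self) -> List[TrainingRecord]:
        """Return only the records owned by this tenant."""
        return [r for r in self._records if r.balai_id == self._balai_id]

    def add_record(self, teacher_id: str, program: str) -> TrainingRecord:
        """Create a record stamped with this tenant's id; callers cannot spoof it."""
        if teacher_id not in self.teachers:
            raise KeyError(f"unknown teacher: {teacher_id}")
        record = TrainingRecord(self._balai_id, teacher_id, program)
        self._records.append(record)
        return record
```

The design choice worth noting is that the tenant id is bound once, when the scope is created, so no call site has to remember to filter by balai.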
Key Decisions
- Data foundation first, features second. Before building any dashboards or APIs, the data had to be clean and the definitions had to be agreed upon. We invested 2 weeks upfront in data engineering that made everything downstream possible.
- Multi-tenant architecture over separate instances. Each balai needed autonomy over their own data while contributing to a national picture. A multi-tenant design gave them both: isolated views with shared infrastructure.
- API-first for interoperability. Rather than forcing manual data entry, we built an API layer so each balai can feed training records, certification data, and program outcomes directly into the platform from their existing systems.
- Pragmatic stack for government constraints. SQL Server was the mandated database. Laravel + Inertia.js + React.js gave us a modern, maintainable stack that the internal team can extend without specialized knowledge.
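As an illustration of the API-first decision, the sketch below validates an incoming training record before it is accepted. It is written in Python for brevity rather than in the platform's Laravel stack, and the payload fields (`balai_id`, `teacher_id`, `program_code`, `completed_on`) are assumptions, not the platform's actual schema.

```python
from datetime import date
from typing import Any, Dict, List

# Hypothetical required fields for one training record submitted by a balai.
REQUIRED_FIELDS = ("balai_id", "teacher_id", "program_code", "completed_on")


def validate_training_payload(payload: Dict[str, Any]) -> List[str]:
    """Return a list of validation errors; an empty list means the payload is valid."""
    errors: List[str] = []

    for field in REQUIRED_FIELDS:
        value = payload.get(field)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"{field}: required non-empty string")

    completed_on = payload.get("completed_on")
    if isinstance(completed_on, str) and completed_on.strip():
        try:
            date.fromisoformat(completed_on)
        except ValueError:
            errors.append("completed_on: expected an ISO 8601 date")

    return errors
```

Returning every error at once, rather than failing on the first, lets a balai's existing system fix a rejected batch in a single round trip.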
The Public Dashboard
Beyond the internal platform, we developed a public-facing dashboard at dapo.vokasi.kemendikdasmen.go.id, giving transparent access to vocational teacher data for policymakers, school administrators, and the public.
The Result
- 7 balai pelatihan unified. All 7 national training centers now share data through a single platform, the first time these institutions have operated on a shared data standard.
- 300,000+ teacher records cleaned and mapped. From 20 million rows of raw teaching history, we produced a verified dataset with each teacher mapped to their primary competency.
- Evidence-based training decisions. Balai can now see which teachers have been trained, which haven't, and who's been invited repeatedly, enabling more equitable distribution of training opportunities.
- National visibility for policymakers. Trained vs. untrained ratios, balai performance metrics, and provincial distribution data, all accessible through the public dashboard.
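Once the data is clean, the questions from the start of the project reduce to simple aggregations. The sketch below is a minimal Python illustration, not the dashboard's actual queries: the function name `training_coverage` and the `repeat_threshold` of 3 invitations are assumptions chosen for the example.

```python
from collections import Counter
from typing import Dict, Iterable, List, Union


def training_coverage(
    teacher_ids: Iterable[str],
    trained_teacher_ids: Iterable[str],
    repeat_threshold: int = 3,  # assumed cutoff for "invited over and over"
) -> Dict[str, Union[int, List[str]]]:
    """Summarise trained vs. untrained teachers and repeat participants.

    `trained_teacher_ids` holds one entry per training event, so a teacher
    who attended four trainings appears four times.
    """
    known = set(teacher_ids)
    # Ignore events for teachers missing from the verified dataset.
    counts = Counter(t for t in trained_teacher_ids if t in known)

    trained = set(counts)
    repeat = sorted(t for t, n in counts.items() if n >= repeat_threshold)

    return {
        "trained": len(trained),
        "untrained": len(known - trained),
        "repeat_participants": repeat,
    }
```

For example, with teachers `T1` to `T4` and events `["T1", "T1", "T1", "T2"]`, the summary reports 2 trained, 2 untrained, and `T1` as a repeat participant.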
Timeline & Role
I served as IT Consultant and full-stack developer alongside a small team. The project spanned 6 months: 3 months of active development (data engineering, platform build, API, and dashboard) followed by 3 months of seminars and onboarding sessions across all 7 balai pelatihan nationwide.
Lessons Learned
The biggest lesson wasn't technical. It was foundational. In government projects, you have to lay the data foundation before building anything. We spent weeks just getting 7 institutions, which had never shared definitions before, to agree on what a "teacher's primary competency" even means.
The pattern repeated at every layer. Building the field is easy. Getting 7 different institutions to agree on why that field matters is the real engineering.
Need a data platform or government-scale system built?
I've shipped platforms that unify data across institutions. Let's talk about your project.