A One-Command Health Check Workflow for Non-Coders Building with AI

AI tools are a game-changer for non-developers like me.

Only two years ago, the idea of someone without a technical background creating complex apps was unthinkable. Now the sky is the limit, and the abilities afforded to people without a coding background continue to expand.

The issue with all this power is that you can also quickly get yourself into trouble with the code you produce. I don't always know if the code AI produces in a vibe-coding session is high-quality, secure, or architected properly. Nor do I know what the downstream effects of a change may be. Early on, I found that changes I made through a prompt would sometimes break an experience in an unexpected way.

My initial solution? Click through the app after every change. But I knew right away this was too time-consuming and would not scale as an app grows. Worse, I realized things were probably breaking in ways I couldn't see: a feature that works today but fails when used a different way, a password accidentally stored in the wrong place, a piece of code that works on its own but breaks something elsewhere.

I needed an automated way to audit code and ensure core experiences were working properly.

I've hung around tech enough to know there are solutions to this. Engineering teams write unit tests and run QA. But learning to write tests myself or hiring someone to do it weren't options I had.

My solution was to have AI build the testing system for me using a single command that runs every check I need, automatically.

The Workflow: /test-core

/test-core is one command that runs eight checks in sequence.

  1. Secrets Safety: Verifies API keys and passwords aren't exposed

  2. Code Quality: Scans for style violations and common mistakes

  3. Type Safety: Ensures data types are used correctly

  4. Security Audit: Checks dependencies for known vulnerabilities

  5. Unit Tests: Verifies core logic with automated spot-checks

  6. Build Verification: Compiles the app to catch integration issues

  7. End-to-End Tests: Simulates real user interactions in a browser

  8. Pre-Deploy Checklist: Surfaces manual checks before going live

If anything fails, the workflow stops and tells me exactly what's wrong. If everything passes, I know the app is healthy.

Step 1: Secrets Check

bash

# Verify .env is in .gitignore
grep -q "^\.env" .gitignore || { echo "ERROR: .env not in .gitignore!"; exit 1; }

# Verify .env is not tracked by git
if git ls-files --error-unmatch .env >/dev/null 2>&1; then
  echo "ERROR: .env is tracked!"
  exit 1
fi
    

(👉 First things first: you should have a .env file. If not, stop now and implement this before doing anything else.)

Running a Secrets Check is the first step, because if secrets are exposed, nothing else matters.

.env files store sensitive information like API keys, database passwords, and other credentials that make your app work. It’s good practice to have this file, but if it ever gets uploaded to your code repository (like GitHub), those secrets are exposed. Anyone can see them. It's like accidentally leaving your house key under the mat with a sign pointing to it.

Adding “.env” to .gitignore tells Git to leave this file out of anything you commit and push to GitHub, ensuring that any information I don’t want exposed stays secret.

This step checks two things:

  1. that your project is configured to never upload this file

  2. that it hasn't been accidentally uploaded already.
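If either check fails, the repair is usually quick. Here's a sketch of the fixes using standard Git commands (this isn't part of the workflow itself, and it assumes you run it from the project root):

```shell
# 1. If .env isn't listed in .gitignore yet, add it:
touch .gitignore   # make sure the file exists first
grep -q "^\.env$" .gitignore || echo ".env" >> .gitignore

# 2. If .env was already committed, untrack it but keep your local copy:
# git rm --cached .env
# (If the file was ever pushed, rotate those keys — Git history remembers.)
```

The `git rm --cached` line is commented out because it changes what Git tracks; run it deliberately, not by accident.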

Step 2: Code Standards (Lint)

bash

npm run lint
    

Think of this as spell-check for your code. It scans every file looking for common mistakes: unused variables, inconsistent patterns, things that technically work but could cause problems later. The AI writes a lot of code quickly, and it doesn't always clean up after itself.
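For example, here's a hypothetical snippet (not from my app) that runs without errors but that a linter would flag:

```javascript
// This code works, but a linter catches the leftover clutter.
function totalPrice(items) {
    const debugLabel = "cart"; // lint: 'debugLabel' is assigned but never used
    return items.reduce((sum, item) => sum + item.price, 0);
}

console.log(totalPrice([{ price: 2 }, { price: 3 }])); // prints 5
```

The unused variable is harmless today, but dozens of leftovers like it make real bugs harder to spot, which is exactly the clutter AI-generated code tends to accumulate.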

Step 3: Type Safety

bash

npm run typecheck
    

My app is written in TypeScript (for reasons I won’t get into here - yet 😀). TypeScript is a version of JavaScript that adds "types." These types are labels that tell the code what kind of information to expect. Is this a number? A piece of text? A list? If a number is passed where text is expected, things can break in subtle ways.

The purpose of this step is to make sure every puzzle piece goes in the right slot. The spell-checker (Step 2) won't catch these errors because some of those rules are intentionally relaxed during early development. This step catches what the spell-checker misses.
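As a sketch, here's a hypothetical function (not from my app) showing the kind of mismatch this step catches:

```typescript
// The type labels say: first a piece of text, then a number.
function formatCount(label: string, count: number): string {
    return `${count} ${label}`;
}

console.log(formatCount("entries", 3)); // prints "3 entries"

// formatCount(3, "entries");
// ^ Swapped arguments: `npm run typecheck` rejects this before it ever runs,
//   while plain JavaScript would silently produce the wrong output.
```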

Step 4: Security Audit

bash

npm audit --audit-level=high
    

Every app depends on dozens (sometimes hundreds) of packages built by other developers. It's like building with LEGO bricks, except some of those bricks might have flaws.

This step scans every package your app uses and checks it against a database of known security problems. I set it to only fail on serious issues (rated "high" or "critical"), so normal maintenance items don't block progress.

Step 5: Unit Tests

bash

npm run test
    

This is the step I understood the least when I started, so let me explain it the way I wish someone had explained it to me.

A unit test is a tiny, automatic spot-check on one specific piece of logic in your app. It doesn't open a browser. It doesn't click anything. It just asks: "If I give this function these inputs, does it give me the right answer?"

For example, one of my unit tests checks what happens when you create a Tayle with the same name as a Tayle that already exists. Does it add "(1)" to the end? What about "(2)"? What if there's a gap in the numbering? The test feeds in different scenarios and verifies each one gives the correct result.

Here's what one of those tests actually looks like:

javascript

// "If I create a Tayle called 'Career Highlights'
//  and one already exists, does it add (1)?"
test("appends (1) when the name already exists", () => {
    existingNames(["Career Highlights"]);
    const result = getUniqueTayleTitle("Career Highlights");
    expect(result).toBe("Career Highlights (1)");  // ✅
});
    

I didn't write this. The AI did. I just said "make sure duplicate names get a number added."

My 13 unit tests cover two areas: naming logic (what happens when you create something with a duplicate name?) and template display (does each template show the right image and color?). These are small pieces of logic that many other features depend on. If they break, lots of things break. They run in about 1 second and catch logic errors instantly. This way I take care of subtle bugs that pass every other check but would frustrate a user.
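To make the naming logic concrete, here's a minimal sketch of what a function like that could look like. It reuses the name from my test for illustration, but it is not my app's actual implementation (the real one takes the existing names from app state rather than as an argument):

```javascript
// Returns a title that doesn't clash with any existing Tayle name.
function getUniqueTayleTitle(name, existingNames) {
    if (!existingNames.includes(name)) return name;       // no clash: keep the name
    let n = 1;
    while (existingNames.includes(`${name} (${n})`)) n++; // find the first free number
    return `${name} (${n})`;
}

console.log(getUniqueTayleTitle("Career Highlights", ["Career Highlights"]));
// prints "Career Highlights (1)"
```

Note the `while` loop handles the gap case from earlier: if "A" and "A (2)" exist but "A (1)" doesn't, it fills the gap with "A (1)".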

Step 6: Build

bash

npm run build
    

This is the dress rehearsal before showtime. It takes the entire app, every file, every component, every dependency, and assembles it into the final version that would go to your users. Individual pieces might work perfectly on their own, but the build step forces everything to come together. Configuration conflicts, missing connections, and other integration issues surface here.

If the build passes, the app can be deployed.

Step 7: End-to-End UX Tests

bash

npx playwright test
    

This is the step that gives me the most confidence, and it's the closest thing to what I used to do manually. Now it's automated and finishes in 10 seconds, instead of the 10 minutes it took me by hand.

Playwright is a tool that opens a real browser and pretends to be a user. It clicks buttons, fills forms, navigates between pages, and checks that everything works the way a real person would experience it.

Here’s an example of a test I run:

javascript

// Click the "New Tayle" button
const newTayleCard = page.getByRole('button', { name: 'New Tayle' });
await newTayleCard.click();

// Verify we landed on the new Tayle's page
await expect(page).toHaveURL(/tayles/);  // ✅
    

My tests cover the things I care about most:

  • Does the app load?

  • Can I create a new Tayle from scratch?

  • Can I save an entry using a template?

  • Does switching between interview modes work?

  • Is the navigation stable?

If something the AI did passes every other check, but broke something in the interface, Playwright will catch it. This is a huge level-up for me in terms of efficiency and quality.

Step 8: Pre-Deploy Reminder

After all seven automated checks pass, the workflow surfaces a manual checklist for things that require human judgment before pushing to production: reviewing access controls, verifying environment settings, confirming database security. Not everything can be automated, and some decisions still need a person.
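A sketch of what that final step could look like; the checklist items here mirror the ones above, but the real file should list whatever your own app needs a human to confirm:

```shell
# Surface the manual checklist once every automated check has passed.
CHECKLIST=$(cat <<'EOF'
PRE-DEPLOY CHECKLIST (needs a human):
  [ ] Review access controls (who can see what?)
  [ ] Verify production environment settings
  [ ] Confirm database security rules
EOF
)
echo "$CHECKLIST"
```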

Why the Order Matters

The steps are sequenced intentionally:

  1. Secrets first — the cheapest check and the most catastrophic failure mode.

  2. Code quality and types next — fast, catches surface-level issues.

  3. Security audit — checks your dependencies for known vulnerabilities.

  4. Unit tests — verifies core logic is correct.

  5. Build — assembles everything and surfaces any remaining integration issues.

  6. E2E tests last — the most expensive check, but the most comprehensive.

If anything fails, the workflow stops immediately. You don't waste time running expensive browser tests when a simple type error would have told you something was wrong in 2 seconds. This is how you "fail fast" and catch problems at the cheapest possible moment.
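To make the fail-fast behavior concrete, here's a sketch of how a runner could chain the checks and halt at the first failure. This is not my actual workflow file; the real steps are the npm and Playwright commands shown above, and I've swapped in stand-ins so the sketch runs anywhere:

```shell
set -e  # abort the script the moment any command fails

run_step() {
    echo "==> $1"
    shift
    "$@" || { echo "FAILED: stopping the workflow here."; exit 1; }
}

# The real workflow would call, in order:
#   run_step "Secrets"   grep -q "^\.env" .gitignore
#   run_step "Lint"      npm run lint
#   run_step "Typecheck" npm run typecheck
#   ...and so on through the Playwright tests.
# Stand-ins keep this sketch runnable:
run_step "Secrets check" true
run_step "Unit tests"    true
echo "All checks passed."
```

Because the cheap checks run first, a 2-second type error never costs you a full browser-test run.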

What I'd Add Next

This workflow covers a lot, but it doesn't cover everything. Here's what I'm exploring for future iterations:

Backend function testing. My app has server-side functions that handle things like sending emails, processing AI responses, and managing user data. Right now, the only way I'd know one of these is broken is when a user reports it. Adding a compile check or basic smoke test for these functions would close that gap.

Accessibility testing. This checks if your app is usable by people with screen readers, keyboard-only navigation, or visual impairments. Playwright actually supports this out of the box. It can scan your pages for missing labels, low contrast, and broken focus order. It's increasingly a legal requirement, and it's on my list.

Visual regression testing. Playwright can take screenshots and compare them against known-good versions. If a CSS change accidentally makes a button invisible or shifts your layout, it flags the difference. It catches visual bugs that pass every other test.

These are all things I plan to add over time. The beauty of the workflow structure is that adding a new check is just adding a new step.

What I Learned

Building this workflow taught me three things:

1. Testing isn't one thing. There are different kinds: style checks, type checks, security scans, logic tests, browser tests. Each catches a different kind of problem.

2. The AI can build your tests for you. I didn't need to understand how to write browser automation from scratch. I described what I wanted verified (e.g. "make sure creating a Tayle from a template works end-to-end") and the AI built the test. I just needed to know what to test, not how to test it.

3. Confidence changes everything. Before this workflow, every session ended with uncertainty. Now it ends with a definitive answer. I'm willing to make bigger changes because I know the safety net will catch me if something breaks. It’s a small implementation detail that changed how I build.

Get the Full Workflow

The complete, copy-paste-ready workflow file is available on GitHub:

vibe-coding-workflows/test-core.md

The repo includes:

  • The full test-core workflow with implementation examples for npm, Python, Rust, and Go

  • The agent rules file from my first blog post

  • Setup instructions for different AI coding tools

  • Adaptation guides for any tech stack

You can copy the files directly, fork the repo, or use them as a reference for building your own version.

What's Next

In my next posts, I'll keep walking through my processes and workflows, including how I design with AI and create a design system that vibe-coding platforms can quickly and easily read.

And, as always, if you're building with AI and want to compare approaches, reach out. I'm happy to talk shop.
