Automation as Specification

Documentation decays. Text files describe how a system should work, but code describes how it does work. When the two diverge, code wins. This is not a moral judgment; it is a fact of execution. The system runs the code, not the documentation. But what if documentation were executable? What if the specification of system requirements was also the mechanism to satisfy them?

An automated environment setup script was introduced for a research automation system. The script generates database credentials, JWT secrets, and API keys, then writes them to a .env file. On the surface, this is routine DevOps automation. But the script is more than convenience; it is a specification that cannot lie.

The script begins by checking for an existing .env file. If one exists, it loads current values. This preserves continuity: database passwords, once set, remain stable across script reruns. If the file does not exist, the script generates new credentials. This logic encodes an operational requirement that might be implicit in written documentation: credentials must persist, not regenerate on every invocation.

This behavior was refined over time. The original script regenerated database passwords unconditionally, breaking existing connections. The fix added a conditional: if DATABASE_PASSWORD exists in the environment, use it; otherwise, generate a new one. This is a small change, but it reveals how automation surfaces edge cases. Written documentation might say "generate credentials," omitting the persistence requirement. The script cannot omit it; the system breaks if it does.
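
The persistence logic described above can be sketched as follows. This is an illustrative reconstruction, not the author's actual script; the file path and variable name are assumptions.

```shell
#!/bin/sh
# Sketch: preserve existing credentials across reruns.
# Load the current .env (if any), then generate only what is missing.
ENV_FILE=".env"

if [ -f "$ENV_FILE" ]; then
  . "$ENV_FILE"   # brings existing values like DATABASE_PASSWORD into scope
fi

# Generate a password only if one was not already set -- the fix that
# stopped the script from breaking existing database connections.
if [ -z "$DATABASE_PASSWORD" ]; then
  DATABASE_PASSWORD="$(openssl rand -base64 32)"
fi

printf 'DATABASE_PASSWORD=%s\n' "$DATABASE_PASSWORD" > "$ENV_FILE"
```

Running this twice produces the same password both times: the second run sources the first run's .env and the generation branch is skipped.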

The script generates passwords using openssl rand -base64 32. This produces a 32-byte random value, base64-encoded, suitable for use as database credentials or secrets. The choice of 32 bytes is deliberate: it provides 256 bits of entropy, beyond the reach of brute-force attacks with current technology. The script does not explain this choice, but the code documents it: 32 bytes is the specification.
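
The length claim can be checked directly: base64 encodes 3 bytes into 4 characters, so 32 bytes become a 44-character string (including one character of padding).

```shell
# 32 random bytes -> base64 length = 4 * ceil(32 / 3) = 44 characters
SECRET="$(openssl rand -base64 32)"
echo "${#SECRET}"  # prints 44
```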

A note was added about URL-encoding special characters in database passwords. Prisma, the ORM used by the system, expects database connection strings in URL format: postgresql://user:password@host:port/database. If the password contains special characters like @ or /, the URL parser interprets them as delimiters, breaking the connection. The solution is to URL-encode the password: @ becomes %40, / becomes %2F.

This requirement is not obvious from the Prisma documentation, which focuses on schema definition and query building. It emerges from the interaction between Prisma's connection library and URL parsing. A written document might miss this detail; a setup script cannot. If the script does not handle URL encoding, the database connection fails, and the failure is immediate and obvious.

The script also configures Clerk authentication keys. It was updated to use Clerk's development environment, which is free and suitable for testing. The script prompts for CLERK_PUBLISHABLE_KEY and CLERK_SECRET_KEY, then validates them by checking the key prefix: pk_test_ for publishable keys, sk_test_ for secret keys. This validation is minimal but effective: it catches typos and ensures keys match the expected environment (test vs. production).
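
The prefix check described above amounts to a few lines of pattern matching. The function name and error messages here are hypothetical; only the pk_test_ / sk_test_ prefixes come from the source.

```shell
#!/bin/sh
# Validate that Clerk keys match the expected (test) environment
# by checking their prefixes before writing them to .env.
validate_clerk_keys() {
  case "$1" in
    pk_test_*) ;;
    *) echo "error: publishable key must start with pk_test_" >&2; return 1 ;;
  esac
  case "$2" in
    sk_test_*) ;;
    *) echo "error: secret key must start with sk_test_" >&2; return 1 ;;
  esac
}

validate_clerk_keys "pk_test_abc123" "sk_test_def456" && echo "keys ok"
```

A production key pasted by mistake (pk_live_...) fails immediately, before the misconfiguration can reach the running system.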

What makes this script a specification? It encodes not just what the system needs, but how those needs should be satisfied. It specifies the length of generated secrets (32 bytes), the format of connection strings (URL-encoded), the environment for authentication (Clerk test keys), and the persistence of credentials (preserve existing values). These details exist in the code and nowhere else. If someone asks "What does the system require to run?" the answer is: run the setup script and observe what it generates.

This approach has a trade-off. Executable specifications are harder to read than prose. A paragraph of text can explain context and reasoning; a shell script shows only mechanics. But the script has a property that prose lacks: it is always accurate. If the script runs successfully, the system is configured correctly. If it fails, the configuration is incomplete. There is no ambiguity.

The deployment history reveals how the script evolved. The initial version was created, then refined to fix credential persistence, then updated with URL-encoding notes, then configured for Clerk environments. Each change responded to a discovered requirement — something the system needed but the script did not yet provide. The evolution is a trace of learning: how the system reveals its actual needs through operation.

This pattern — automation as specification — extends beyond setup scripts. Database migration scripts specify schema changes. CI/CD pipelines specify build and deployment steps. Infrastructure-as-code templates specify resource provisioning. In each case, the code is both the documentation and the implementation. You cannot have documentation drift because there is no separate documentation to drift from.

But this only works if the scripts are maintained. A setup script that no longer matches current system requirements is worse than no script at all: it creates false confidence. The system appears to be configurable but produces a broken configuration. Version control is essential here. By committing changes to the script alongside changes to the system, you ensure they stay synchronized. Each change that modifies environment requirements should update the setup script in the same transaction.

The script also serves as a contract with future operators. Someone deploying the system six months from now does not need to read setup instructions, track down credential formats, or guess at appropriate secret lengths. They run the setup script, and the script handles those details. This reduces cognitive load and eliminates a class of deployment errors: misconfigured environments.

Written documentation still has a role. It provides context, explains trade-offs, and describes why the system works the way it does. But for operational tasks — configuration, deployment, migration — executable specifications are superior. They are always correct, always up-to-date, and always reproducible.

The setup script is 150 lines of shell code. It is not elegant or particularly clever. But it is precise. It specifies exactly what the system requires and exactly how to provide it. That precision is valuable. It eliminates ambiguity, reduces errors, and creates a foundation for reliable deployment.

In a broader sense, this reflects a principle: make assumptions explicit through automation. If the system requires 32-byte secrets, encode that in the generation script. If database passwords need URL encoding, encode that in the setup process. If authentication keys must match a specific environment, validate that in the configuration step. Each requirement, made explicit and executable, becomes a constraint the system enforces on itself.

The deployment history is the record of those constraints emerging. Early changes show the system as imagined; later changes show the system as it actually must be. The difference between the two is the knowledge you gain by running the system. Automation as specification captures that knowledge in a form that persists and compounds.