I Tried Adding Randomness to My Prompts for 14 Days—The Results Surprised Me

I’m getting tired of the "Prompt Engineering" hype. Seriously.

Everywhere I look, some guru is selling a cheat sheet on how to say "please" to a robot. They act like if you just swap "Write a blog post" with "Craft a compelling narrative," the AI will suddenly turn into Shakespeare.

I call BS.

I spent the last two weeks running a little experiment. I stopped obsessing over my words. Instead, I started messing with the engine settings—the actual math the model uses to pick the next word.

And... the results were kinda shocking.

I realized we’ve been looking at this all wrong. It’s not about better prompts. It’s about "decoding."

"Once I started thinking about LLMs this way — as probabilistic next-token generators rather than deterministic answer machines — the behavior I was seeing began to make sense." - Aditya

The "Safe Mode" Trap

Here is the thing most people don’t get.

When you use ChatGPT or Claude out of the box, it often defaults to something close to "Greedy Decoding."

Sounds fancy, right? It isn't.

Greedy decoding just means the model looks at the probability of the next word and picks the absolute safest, most likely option. Every. Single. Time.

So if you type "The cat sat on the...", the model picks "mat." It never picks "roomba" or "edge of the universe."

I was reading this deep dive into LLM decoding and it clicked. This is why your AI content feels robotic. It’s not your prompt. It’s because the model is playing it safe.

It maximizes "local likelihood." It doesn't care about the big picture.
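To make "local likelihood" concrete, here's a tiny sketch of greedy decoding over a made-up next-token distribution (the tokens and numbers are invented for illustration, not from any real model):

```python
# Toy next-token distribution for "The cat sat on the ..."
# (made-up probabilities, purely for illustration)
next_token_probs = {"mat": 0.55, "floor": 0.30, "roomba": 0.10, "edge": 0.05}

def greedy_pick(probs):
    """Greedy decoding: always take the single most likely token."""
    return max(probs, key=probs.get)

print(greedy_pick(next_token_probs))  # -> "mat", every single time
```

No randomness anywhere. Run it a thousand times, you get "mat" a thousand times.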

Twist the Knobs (But Not the Way You Think)

So I spent 14 days turning the knobs. Specifically, Temperature, Top-k, and Top-p.

Most people think Temperature adds "creativity."

Wrong.

It reshapes the probability curve. Low temp? The model is a strict librarian. High temp? It’s a drunk poet.
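You can see the reshaping in about ten lines. Temperature divides the raw scores (logits) before the softmax: below 1 the top token hogs the probability mass, above 1 the mass spreads out. The logits here are made-up numbers:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/T, then softmax. T < 1 sharpens, T > 1 flattens."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 3.0, 1.0]  # invented scores for three candidate tokens

print(softmax_with_temperature(logits, 0.5))  # strict librarian: top token dominates
print(softmax_with_temperature(logits, 1.5))  # drunk poet: mass spreads out
```

Same model, same scores. Only the curve changes.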

But here is where it gets interesting. Top-k vs. Top-p.

Top-k is dumb. You tell the model, "Only look at the top 10 words." Doesn't matter if the 11th word was perfect. It’s cut off.

Top-p (Nucleus Sampling) is the secret sauce. Instead of a fixed number, you set a cumulative probability (like 0.9). The model looks at the top words until their combined likelihood hits 90%.

If the model is confident? It only looks at 2 words.

If the model is confused? It widens the net.

It adapts. Finally, something smart.
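Here's that adaptivity in code. A minimal sketch of both filters over two made-up distributions, one confident and one confused (again, toy numbers, not real model output):

```python
def top_k_filter(probs, k):
    """Top-k: keep exactly k tokens, no matter how good token k+1 was."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:k])

def top_p_filter(probs, p):
    """Nucleus sampling: keep the smallest set whose cumulative prob >= p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cum = {}, 0.0
    for token, prob in ranked:
        kept[token] = prob
        cum += prob
        if cum >= p:
            break
    return kept

confident = {"mat": 0.85, "floor": 0.08, "rug": 0.04, "roomba": 0.03}
confused = {"mat": 0.25, "floor": 0.22, "rug": 0.20, "roomba": 0.18, "edge": 0.15}

print(top_p_filter(confident, 0.9))  # 2 tokens survive
print(top_p_filter(confused, 0.9))   # the net widens to 5 tokens
```

Top-k would hand both distributions the same fixed-size shortlist. Top-p shrinks the shortlist when the model is sure and grows it when the model isn't.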

The "Chaos" Breakdown

After running the same prompts through different decoding strategies for two weeks, here is how each one actually behaved.

| Strategy | The Vibe | The Risk | Best Used For |
| --- | --- | --- | --- |
| Greedy Decoding | Robotic, safe, boring. | Zero creativity. Loops text. | Math, coding, basic facts. |
| High Temperature | Wild, expressive, erratic. | Hallucinations galore. | Brainstorming fiction. |
| Beam Search | Polished but lifeless. | Creates "Corporate Speak." | Translations. |
| Top-p (Nucleus) | The sweet spot. Balanced. | Can still drift if too high. | Chat, writing, actual human talk. |

The 14-Day Verdict

My first instinct was always to blame the prompt.

If the answer was bad, I’d rewrite the input. I’d add constraints. I’d threaten the AI (we’ve all done it).

But after this experiment, I realized I was screaming at the driver when the car was in the wrong gear.

When I switched to Top-p sampling with moderate temperature, the answers didn't just change. They felt... human. The model explored alternatives when it was unsure but locked in when it knew the answer.

It wasn't mechanical anymore.

And I didn't change a single word of my prompt.

"Decoding strategy directly shapes how an LLM behaves under uncertainty... It influences whether responses are cautious or expressive, repetitive or exploratory." - Aditya

Stop Tuning Words, Start Tuning Math

Here is my advice.

If you are building an app or just using the API, stop spending 10 hours refining your prompt text.

Go into the settings.

Kill the "Greedy" defaults. Ignore Beam Search unless you're doing translation. Set your Top-p to around 0.9 and your Temperature to 0.7.
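If you want to see what those two settings do together, here's a toy sketch that chains them: temperature first, then the top-p cutoff, then an actual random draw from whatever survives. (The logits are invented; real APIs just expose `temperature` and `top_p` as request parameters and do this for you server-side.)

```python
import math
import random

def sample_next_token(logits, temperature=0.7, top_p=0.9, rng=random):
    """Temperature-scaled softmax followed by nucleus (top-p) sampling."""
    # 1. Temperature: reshape the probability curve.
    scaled = {t: l / temperature for t, l in logits.items()}
    m = max(scaled.values())  # subtract max for numerical stability
    exps = {t: math.exp(s - m) for t, s in scaled.items()}
    total = sum(exps.values())
    probs = {t: e / total for t, e in exps.items()}

    # 2. Top-p: keep the smallest set of tokens covering >= top_p mass.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cum = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cum += prob
        if cum >= top_p:
            break

    # 3. Sample from the surviving nucleus (weights need not sum to 1).
    tokens, weights = zip(*kept)
    return rng.choices(tokens, weights=weights, k=1)[0]

toy_logits = {"mat": 4.0, "floor": 2.5, "roomba": 1.0}  # made-up scores
print(sample_next_token(toy_logits))
```

Confident distribution, narrow nucleus, stable output. Flat distribution, wide nucleus, more exploration. That's the whole trick.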

You aren't a prompt engineer anymore. You're a chaos manager. Act like one.
