pixxelkick · 6 hours ago

I thought that, canonically, Avada Kedavra fucks up your soul or whatever every time you use it, and it slowly corrupts you or something.

So it has a downside.

Also, you have to really mean it, and basically be a psychopath, for it to even work right.

Aren’t there explanations for why people don’t just use it willy-nilly?

pixxelkick · 11 days ago

The wrong way (that most devs fall into) is to let the LLM generate the code, then to say “write tests for this”.

It’s important to note that this problem is just as prevalent with humans. I’ve met countless devs who fall into this same trap with their own code and tests.

The whole point of TDD is writing the tests first; that’s why it’s called Test Driven Development.

If your instructions emphasize TDD, this definitely helps steer the agent in the direction of “tests first, then make the tests pass”.

This works especially well with bugs. If I tell an LLM “x bug report was found, first replicate the bug with a test, and then fix it”, its efficacy skyrockets.

They’re pretty capable of replicating the bug first, and once they have a test that does that, they can pretty much always solve the problem if it’s not super super esoteric.
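
To make the “replicate the bug with a test first” flow concrete, here is a minimal sketch in Python. Everything in it is hypothetical and only for illustration: `parse_price` and the bug report (prices with a thousands separator crashing the parser) are made up, not taken from any real project.

```python
import unittest


def parse_price(text: str) -> float:
    """Hypothetical function under test: parse a price string like '$1,234.50'."""
    # The fix: strip thousands separators before converting.
    # The imagined buggy version was float(text.lstrip("$")), which raised
    # ValueError on any input containing a comma.
    return float(text.lstrip("$").replace(",", ""))


class ParsePriceBugTest(unittest.TestCase):
    def test_replicates_reported_bug_with_thousands_separator(self):
        # Step 1: written BEFORE the fix, and confirmed to fail against the
        # buggy version. Only then is the function changed.
        self.assertEqual(parse_price("$1,234.50"), 1234.50)

    def test_existing_behaviour_still_works(self):
        # Guards against the fix breaking the case that already worked.
        self.assertEqual(parse_price("$19.99"), 19.99)
```

Run it with `python -m unittest`. The order is what matters: the failing test comes first, so the agent (or the human) has something objective to iterate against.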

pixxelkick · 11 days ago

See my other post about how beauty can literally be found in anything, and how the practice of art is a lot like working out a muscle group. Finding beauty takes practice. Lacking that skill is nothing to be ashamed of, and anyone can gain it by working at it.

But once you can find beauty in anything, even an apple sitting on a plate, it does become easier to find beauty in anything else, including oneself.

pixxelkick · 11 days ago

but why do we need to write that boilerplate again and again in the first place?

No matter how hard you try to abstract a system, at some point you must encode and describe the domain rules of how the application works.

The biggest offender is tests. You must define them so that every test works in a glass box.

Having quiet side effects in tests is bad practice.

So every test ends up being 99% boilerplate in its “arrange” section, because you must set the stage to assert upon.

There is no way to abstract your way out of this; it simply must be done.
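
A rough sketch of what that looks like in practice, in Python. The `Cart` class is a hypothetical stand-in invented for this example; the point is only that both tests spell out their own “arrange” step in full instead of leaning on hidden shared state.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    """Hypothetical domain object, here only so the tests have something to arrange."""
    items: dict = field(default_factory=dict)
    discount_percent: int = 0

    def add(self, name: str, price_cents: int) -> None:
        self.items[name] = price_cents

    def total_cents(self) -> int:
        subtotal = sum(self.items.values())
        return subtotal * (100 - self.discount_percent) // 100


def test_total_without_discount():
    # Arrange: the test builds its own stage, nothing shared or hidden.
    cart = Cart()
    cart.add("apple", 150)
    cart.add("plate", 850)
    # Act
    total = cart.total_cents()
    # Assert
    assert total == 1000


def test_total_with_discount():
    # Arrange: nearly identical setup, repeated on purpose.
    cart = Cart(discount_percent=10)
    cart.add("apple", 150)
    cart.add("plate", 850)
    # Act
    total = cart.total_cents()
    # Assert
    assert total == 900
```

Most of each test is setup, and that is the boilerplate being described: repetitive, but anyone reading a single test can see everything it depends on.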

It’s a classic pitfall of developers to keep trying to abstract their stuff to hell and back again.

Each time you abstract you narrow your capabilities. Your abstracted concept tightens what the logic can do, because you literally abstracted pieces of it away.

The more you abstract, the more rigid and brittle your capabilities become.

Abstracting just enough to strike the right balance between flexible and rigid is basically the whole art of architecture, and it takes years of practice.

You will always have boilerplate. Period.

And thus you cannot get away from the fact that LLMs are pretty dang good at that specific part of the job.

pixxelkick · edited 11 days ago

Did you have MCP tooling set up so it can get LSP feedback? This helps a lot with code quality, as it’ll see warnings/hints/suggestions from the LSP.
Unit tests. Unit tests. Unit tests. Unit tests.

I cannot stress enough how much less stupid LLMs get when they have proper, solid unit tests to run themselves and compare expected vs actual outcomes.

Instead of reasoning out “it should do this” they can just run the damn test and find out.

They’ll iterate on it until it actually works, and then you can look at it and confirm if it’s good or not.

I use Sonnet 4.5 / 4.6 extensively and, yes, it’s prone to getting the answer almost right but wrong in the end.

But the unit tests catch this, and it corrects.
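
As an illustration of the “expected vs actual” loop, here is a small Python sketch of the kind of test that catches almost-right math. It is not code from the repo linked below; `rotate` and `world_position` are hypothetical helpers, and the expected value is one you can verify by hand.

```python
import math


def rotate(point, degrees):
    """Rotate a 2D point counter-clockwise around the origin."""
    rad = math.radians(degrees)
    x, y = point
    return (x * math.cos(rad) - y * math.sin(rad),
            x * math.sin(rad) + y * math.cos(rad))


def world_position(parent_pos, parent_rotation_deg, local_pos):
    """Hypothetical transform: rotate the child's local offset, then translate by the parent."""
    rx, ry = rotate(local_pos, parent_rotation_deg)
    return (parent_pos[0] + rx, parent_pos[1] + ry)


def test_child_follows_parent_rotation():
    # Hand-checked: a child 1 unit to the right of a parent at (10, 0)
    # that is rotated 90 degrees should end up 1 unit above the parent.
    expected = (10.0, 1.0)
    actual = world_position((10.0, 0.0), 90.0, (1.0, 0.0))
    assert math.isclose(actual[0], expected[0], abs_tol=1e-9)
    assert math.isclose(actual[1], expected[1], abs_tol=1e-9)
```

If the agent swaps a sign or applies translation before rotation, this fails immediately, and it has a concrete number to iterate toward instead of a guess.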

Example: I am working on my own game engine with MonoGame and it’s about 95% vibe coded.

This transform math is almost 100% vibe coded: https://github.com/SteffenBlake/Atomic.Net/blob/main/MonoGame/Atomic.Net.MonoGame/Transform/TransformRegistry.cs

The reason it’s solid is because of this: https://github.com/SteffenBlake/Atomic.Net/blob/main/MonoGame/Atomic.Net.MonoGame.Tests/Transform/Integrations/TransformRegistryIntegrationTests.cs

Also vibe coded and then sanity checked by me by hand to confirm the math checks out for the tests.

And yes, it caught multiple bugs, but the agent could automatically respond to that, fix the bug, rerun the tests, and iterate til everything was solid.

Test Driven Development is huge for making agents self-police their own code.

pixxelkick · 11 days ago

that’s still really hard to do when you don’t like how you look.

Never said it was easy, just that it was important!

One of the best ways I found to discover beauty was practicing art. The process of discovering beauty in anything and anyone makes it a lot less challenging to then find beauty in yourself.

When you can look at a mark on your body and go “I love how this breaks up the negative space and adds an interesting dynamic rhythm to my features” you are on the path.

The human body is full of beautiful curves, contours, patterns, shapes.

And so are apples.

If I can find beauty in a picture of an apple sitting on a table, it sure isn’t more challenging to see the same in my own reflection haha.

pixxelkick · 11 days ago

Some need awareness that external validation exists

Hard disagree, external validation is not a fundamental need.

Any perception that it is, is an internalized mental blocker (likely due to countless years of social conditioning).

Your cage is made out of sticks.

pixxelkick · 11 days ago

But what if I am the one who hates my body the most

That is still the only opinion that matters.

Matters, in the sense of “should determine action”

If 100 million people say you are ugly, they can go fuck themselves.

But if you say you are ugly, that matters and should be addressed and worked on.

pixxelkick · 12 days ago

I dunno Beholders have a tendency to think pretty highly of themselves though… 🤔

pixxelkick · 12 days ago

I actually dislike this.

The narrative that your value comes from external validation is one you see a lot.

But a person shouldn’t be told that “you are beautiful based on another person’s perspective” because that cuts both ways.

If you tell them this, it can easily be flipped around to say “okay but there’s 10 million people on the internet who would love to call you ugly so that’s 10 million to 1”

Instead, the only person whose opinion on your body matters is YOU, and that’s it.

And I’ll keep banging that drum.

If I have a daughter, I’m gonna tell her this all the time: “tell anyone who tries to convince you that beauty is in the eye of the beholder, that the beholder oughta keep their fuckin opinions to themselves”

pixxelkick · 12 days ago

There’s also a massive distinction between consuming something necessary/important, vs consuming something 100% optional.

Harry Potter isn’t food, shelter, or any other kind of critical necessity.

There are literally countless better alternatives to Harry Potter that you can choose to consume, ones that don’t put money straight into the pocket of someone actively funding direct harm.

This isn’t multiple layers of washing here, that money basically goes straight towards actively harming minority groups.

It’s not even a good fucking book, and I used to be a fan of it as a kid, but I went back and read my old books and… it just fuckin sucks dawg, it’s not good lol.

Go pick like, any other fandom at least.

pixxelkick · 12 days ago

I mean, for what it’s worth, if the dude has a partner they could’ve just picked the kid up. It’s not exactly unheard of for one partner to do drop-off and the other to do pick-up… kind of a lukewarm take here.

pixxelkick · 12 days ago

The difference, when the tool is used correctly, is so massive that only someone deeply uninformed or naive would contend it.

I got about 4 entire days’ worth of work completed in about 5 hours yesterday at my job; that’s just objective fact.

Tasks that used to take weeks now take days, and tasks that used to take days now take hours. There’s no “feeling” about this. I’ve been a professional software developer for approaching 17 years now. I know how long it takes to produce an entire gamut of integration tests for a given feature. I spend almost all of my time now reviewing mountains of code (which is fairly good quality, the machines produce fairly accurate results), and then a small amount of time refining it.

People really do not understand how dramatically the results have changed over the past 2 years, and their biases are based on how things were 2 years ago.

Sure, 2 years ago the quality was way worse, the security was bad, the enforcement almost nonexistent, and people’s overall skill with the tools was just beginning to grow. You can’t exactly be good at using a tool that only just came out.

But it’s been two years of very rapid improvement. It’s good now. Anyone who has been using these tools and actually monitoring their progression can speak to this.

Things heavily shifted about 5 months ago when competition started to really fire up between different providers. I won’t say it’s even close to great yet, but it’s definitely good: it works, it’s fast, and it’s pretty damn good at what I need it to do.

pixxelkick · 15 days ago

Go find the richest person in your local city.

You have one, they exist. Probably several in the same area.

Make it their problem.

pixxelkick · 15 days ago

We aren’t in active physical danger.

You aren’t in perceived active physical danger.

If Trump launches a nuke, boy, that sure will change lickety-split tho.

pixxelkick · 15 days ago

You know programmers who use llms believe they’re much more productive because they keep getting that dopamine hit, but when you actually measure it, they’re slower by about 20%.

Everyone keeps citing this preliminary study and ignores:

It’s old now
Its sample size was incredibly tiny
Its sample group was developers who were neither using proper tooling nor trained on how to use the tools

It’s the equivalent of taking 12 seasoned carpenters with very little experience in industrial painting, handing them industrial grade paint guns that are misconfigured and uncalibrated, and then asking them to paint some of their work and watching them struggle… and then going “wow look at that, industrial grade paint guns are so bad”

Anyone with any sense should look at that and go “that’s a bogus study”

But people with intense anti-AI bias cling to that shoddy ass study with such religious fervor. It’s cringe.

Every professional developer with actual training and actual proper tooling can confirm that they are indeed tremendously more productive.

pixxelkick · 15 days ago

I find this is only the direction sought by half-baked devs who aren’t bothering to actually proofread the stuff their agents churn out.

They “trust” it without any true proof of trust.

Agents are INCREDIBLY prone to fudging and faking “success” metrics, especially when put under context pressure.

I’ve seen everything from commenting out tests to fake passes, or changing the asserts on tests to fake a success, to just slapping “to be implemented later” on something and then calling that done.

You fundamentally cannot automate away proving that an agent actually did its job right, full stop. You can make it write tests, but now how do you know the tests were written right?

At some point you HAVE to actually sit and read the code, read the diffs, and check the work. If you don’t, you are opening yourself up to all manner of problems, especially if whatever you are working on is remotely sensitive. If the tool/app/whatever has any kind of auth or handles any kind of sensitive data, you MUST still be auditing every change.
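
None of this replaces reading the diff, but a cheap tripwire can tell you which diffs to read first. The sketch below is an assumption about how one might do that, not an established tool: `count_asserts` and `lost_asserts` are hypothetical helpers that compare how many assertions a Python test file had before and after an agent touched it. Commented-out asserts disappear from the count because comments are not part of the syntax tree.

```python
import ast


def count_asserts(source: str) -> int:
    """Count assert statements and assert*-style method calls in Python source."""
    count = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assert):
            count += 1  # plain `assert ...`
        elif (isinstance(node, ast.Call)
              and isinstance(node.func, ast.Attribute)
              and node.func.attr.startswith("assert")):
            count += 1  # e.g. self.assertEqual(...)
    return count


def lost_asserts(before: str, after: str) -> int:
    """How many assertions vanished between two versions of a test file."""
    return max(0, count_asserts(before) - count_asserts(after))


# Made-up example: the "after" version has one assert commented out.
BEFORE = "def test_a():\n    assert f(1) == 2\n    assert f(2) == 3\n"
AFTER = "def test_a():\n    # assert f(1) == 2\n    assert f(2) == 3\n"

if lost_asserts(BEFORE, AFTER) > 0:
    print("assertions were removed, read this diff by hand")
```

It only flags a suspicious change. An agent that rewrites an assert into something trivially true would sail straight through, which is exactly why a human still has to look.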

And thus, the IDE continues to still be the tool I prefer to sit and sanity check the code as it gets produced.

Doesn’t matter which one I use, I need the ability to live read and diff code and steer the agent away from disaster.

If you blindly trust agents without constantly auditing their code, you are just setting yourself up for failure.

pixxelkick · 15 days ago

Lovely anthropic mcp. Make sure you give anthropic lots of money and use their tools

It’s becoming clear you have no clue wtf you are talking about.

Model Context Protocol is a protocol, like HTTP, JSON, etc.

It’s just a data format that is open source and that anyone can use. Models are trained to be able to invoke MCP tools to perform actions, and anyone can just make their own MCP tools; it’s incredibly simple and easy. I have a pretty powerful one I personally maintain myself.

Anthropic doesn’t make any money off me. In fact, I don’t use any of their shit, except maybe via whatever licensing fees Microsoft pays them to use Claude Sonnet, but Microsoft Copilot is my preferred service overall.

I bet you your contract with them says they’re not liable for shit their llm does to your files

Setting aside the fact that I don’t even use Anthropic’s tools, my Copilot LLMs don’t have access to my files either. Full stop.

The only context in which they do have access to files is inside the aforementioned Docker-based sandbox I run them in, which is an ephemeral, immutable system that they can do whatever the fuck they want inside of, because even if they manage to delete /var/lib or whatever, I click 1 button to reboot and reset it back to a working state.

The workspace directory they have access to has read-only git access, so they can pull and do work, but they literally don’t even have the ability to push. All they can do is pull in the stuff to work on and work on it.

After they finish, I review what changes they made and only I, the human, have the ability to accept what they have done, or deny it, and then actually push it myself.

This is all basic shit using tools that have existed for a long time, some of which are core principles of Linux and have existed for decades.

Doing this isn’t that hard, it’s just that a lot of people are:

Stupid
Lazy
Scared of Linux

The concept of “make a Docker image that runs an ‘agent’ user in a very low privilege env with write access only to its home directory” isn’t even that hard.

It took me all of 2 days to get it set up personally, from scratch.

But now my sandbox literally doesn’t even expose to the LLM the ability to do damage; it doesn’t even have access to those commands.

Let me make this abundantly clear if you cant wrap your head around it:

LLM agents that I run don’t even have the executable commands exposed to them that could cause any damage. They literally don’t even have the ability to do it, full stop.

And it wasn’t even that hard to do.

pixxelkick · 16 days ago

You’ll be the 4753rd guy with the oops my llm trashed my setup and disobeyed my explicit rules for keeping it in check

Read what I wrote.

It’s not a matter of “rules” it “obeys”.

It’s a matter of it literally not even having access to do such things.

This is what I’m talking about. People are complaining about issues that were solved a long time ago.

People are running into issues that were solved long ago because they are too lazy to use the solutions to those issues.

We now live in a world with plenty of PPE in construction, and people are out here raw dogging tools without any modern protection and being ShockedPikachuFace when it fails.

The approach of “I’m gonna tell the LLM not to do stuff in a markdown file” is tech from like 2 years ago.

People still do that. Stupid people who deserve to have it blow up in their face.

Use proper tools. Use MCP. Use a sandbox environment. Use whitelist opt in tooling.

Agents shouldn’t even have the ability to do damaging actions in the first place.
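
For a rough idea of what “whitelist opt in tooling” can mean, here is a minimal Python sketch. It is an illustration under assumptions, not my actual setup: the allowed command sets and the `validate` / `run_tool` names are made up for the example, and a real setup would still sit inside a sandbox where the dangerous binaries aren’t present at all.

```python
import shlex
import subprocess

# Hypothetical opt-in lists: anything not named here simply cannot be run.
ALLOWED_COMMANDS = {"ls", "cat", "git"}
ALLOWED_GIT_SUBCOMMANDS = {"status", "diff", "log", "pull"}  # note: no "push"


class CommandNotAllowed(Exception):
    """Raised when the agent asks for something outside the whitelist."""


def validate(command_line: str) -> list[str]:
    """Split the request into argv and reject anything not explicitly opted in."""
    argv = shlex.split(command_line)
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        raise CommandNotAllowed(f"command not whitelisted: {command_line!r}")
    if argv[0] == "git" and (len(argv) < 2 or argv[1] not in ALLOWED_GIT_SUBCOMMANDS):
        raise CommandNotAllowed(f"git subcommand not whitelisted: {command_line!r}")
    return argv


def run_tool(command_line: str) -> str:
    """The only entry point the agent gets; it never sees a raw shell."""
    argv = validate(command_line)
    # Passing a list (no shell=True) means no shell is involved, so pipes,
    # `;` and `&&` in the request are never interpreted.
    result = subprocess.run(argv, capture_output=True, text=True, timeout=30)
    return result.stdout
```

With this, `run_tool("git status")` goes through, while `run_tool("git push")` or `run_tool("rm -rf /")` raises `CommandNotAllowed` before anything executes. The agent isn’t being asked nicely not to do it; the option just isn’t there.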

pixxelkick · 1 year ago

Personal Work: I Remember the Light,

pixxelkick · 2 years ago

Any Poly folks here that are forever Monagomous?

pixxelkick · 3 years ago

Self Hosted Database ERD Manager?

pixxelkick · 3 years ago

I feel like I am crazy, where is the login?

pixxelkick · 3 years ago

What is the planned solution for cross-host link sharing?
