Show HN: A state-based narrative engine for tabletop RPGs
2 by KoeppyLoco | 0 comments on Hacker News.
I’m experimenting with modeling tabletop RPG adventures as explicit narrative state rather than linear scripts. Everdice is a small web app that tracks conditional scenes and choice-driven state transitions to preserve continuity across long or asynchronous campaigns. The core contribution is explicit narrative state and causality, not automation. The real heavy lifting happens in the DM Toolkit/Run Sessions area, which integrates CAML (Canonical Adventure Modeling Language), a format I developed to carry narratives across any number of platforms. I also built an npm package, CAML-lint, to check the validity of narratives. I'm interested in your thoughts. https://ift.tt/GzhAtbg
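A minimal sketch of the idea, assuming a hypothetical scene/transition model rather than Everdice's actual data structures or the CAML schema:

```typescript
// Hypothetical sketch of a narrative state machine: scenes unlock when their
// conditions hold against the campaign's flag state, and choices are explicit,
// replayable state transitions. Not the actual CAML schema.
type Flags = Record<string, boolean | number | string>;

interface Choice {
  label: string;
  effects: (state: Flags) => Flags; // how this choice changes campaign state
  nextScene: string;                // id of the scene this choice leads to
}

interface Scene {
  id: string;
  condition: (state: Flags) => boolean; // scene is reachable only if this holds
  choices: Choice[];
}

// Example: a ferry crossing is only available once the bridge has collapsed.
const ferry: Scene = {
  id: "ferry-crossing",
  condition: (s) => s["bridgeCollapsed"] === true,
  choices: [
    {
      label: "Bribe the ferryman",
      effects: (s) => ({ ...s, gold: (s["gold"] as number) - 10, ferrymanAlly: true }),
      nextScene: "far-shore",
    },
  ],
};

// Applying a choice yields the next state and scene, so continuity across
// long or asynchronous sessions is just a log of these transitions.
function applyChoice(state: Flags, scene: Scene, index: number): { state: Flags; next: string } {
  const choice = scene.choices[index];
  return { state: choice.effects(state), next: choice.nextScene };
}
```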
Hack Nux
Watch the number of websites hacked today increase in real time, one by one, on a single page.
New Show Hacker News story: Show HN: Morph – Videos of AI testing your PR, embedded in GitHub
Show HN: Morph – Videos of AI testing your PR, embedded in GitHub
7 by bhaktatejas922 | 1 comment on Hacker News.
I review PRs all day and I've basically stopped reading them. Someone opens a 2000-line PR, I scroll, see it's mostly AI-generated React components, leave a comment, merge. I felt bad about it until I realized everyone on my team does the same thing. The problem is that diffs are the wrong format: a PR might change how three buttons behave, and staring at green and red lines to understand that is crazy.

The core reason we built this is that products today are built with assumptions from the past. 100x code with the same review systems means 100x human attention, and human attention cannot scale to fit that need, so we built something different. Humans are provably more engaged with video content than text. So we built and RL-trained an agent that watches your preview deployment when you open a PR, clicks around the stuff that changed, and posts a video in the PR itself.

The hardest part was figuring out where changed code actually lives in the running app. A diff could say Button.tsx line 47 changed, but that doesn't tell you how to find that button. We walk React's Fiber tree, where each node maps back to source files, so we can trace changes to bounding boxes for the DOM elements. We then reward the model for showing and interacting within those boxes. This obviously only works with React, so we have to get more clever when generalizing to all languages.

We trained an RL agent to interact with those components. Simple reward: points for getting modified stuff into the viewport, double for clicking/typing. About 30% of what it does is weird (partial form submits, hitting escape mid-modal) because real users do that stuff and polite AI models won't test it on their own. This catches things unit tests miss completely: z-index bugs where something renders but you can't click it, scroll containers that trap you, handlers that fail silently.

What's janky right now: feature flags, storing different user states, and anything that requires context not provided.

Free to try: https://ift.tt/lWs2it3 Demo: https://www.youtube.com/watch?v=Tc66RMA0nCY
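A rough sketch of how the fiber-to-source mapping and the reward could look, assuming a development React build where fibers expose _debugSource metadata; hypothetical code, not Morph's implementation:

```typescript
// Hypothetical sketch (not Morph's code): walk a React Fiber tree and collect
// bounding boxes for DOM nodes rendered by components defined in files the PR
// touched. Assumes a development build where fibers carry _debugSource.
interface FiberLike {
  child: FiberLike | null;
  sibling: FiberLike | null;
  stateNode: unknown; // host fibers point at real DOM elements
  _debugSource?: { fileName: string; lineNumber: number };
}

function boxesForChangedFiles(root: FiberLike, changedFiles: Set<string>): DOMRect[] {
  const boxes: DOMRect[] = [];
  const stack: FiberLike[] = [root];
  while (stack.length > 0) {
    const fiber = stack.pop()!;
    const file = fiber._debugSource?.fileName;
    if (file && changedFiles.has(file) && fiber.stateNode instanceof Element) {
      boxes.push(fiber.stateNode.getBoundingClientRect()); // region the agent should exercise
    }
    if (fiber.child) stack.push(fiber.child);
    if (fiber.sibling) stack.push(fiber.sibling);
  }
  return boxes;
}

// The reward described above, in its simplest form: points for getting a
// changed element into the viewport, double for clicking or typing into it.
function reward(step: { changedElementsInViewport: number; interactionsWithChanged: number }): number {
  return step.changedElementsInViewport + 2 * step.interactionsWithChanged;
}
```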
New ask Hacker News story: Ask HN: Mem0 stores memories, but doesn't learn user patterns
Ask HN: Mem0 stores memories, but doesn't learn user patterns
4 by fliellerjulian | 2 comments on Hacker News.
We're a YC W23 company building AI agents for engineering labs - our customers run similar analyses repeatedly, and the agent treated every session like a blank slate. We looked at Mem0, Letta/MemGPT, and similar memory solutions. They all solve a different problem: storing facts from conversations — "user prefers Python," "user is vegetarian." That's key-value memory with semantic search. Useful, but not what we needed.

What we needed was something that learns user patterns implicitly from behavior over time. When a customer corrects a threshold from 85% to 80% three sessions in a row, the agent should just know that next time. When a team always re-runs with stricter filters, the system should pick up on that pattern.

So we built an internal API around a simple idea: user corrections are the highest-signal data. Instead of ingesting chat messages and hoping an LLM extracts something, we capture structured events — what the agent produced, what the user changed, what they accepted. A background job periodically runs an LLM pass to extract patterns and builds a confidence-weighted preference profile per user/team/org. Before each session, the agent fetches the profile and gets smarter over time.

The gap as I see it:
• Mem0 = memory storage + retrieval. Doesn't learn patterns.
• Letta = self-editing agent memory. Closer, but no implicit learning from behavior.
• Missing = a preference learning layer that watches how users interact with agents and builds an evolving model. Like a rec engine for agent personalization.

I built this for our domain but the approach is domain-agnostic. Curious if others are hitting the same wall with their agents. Happy to share the architecture, prompts, and confidence scoring approach in detail.
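A rough sketch of the described event capture and confidence-weighted profile, with hypothetical names and a plain aggregation standing in for the LLM pattern-extraction pass:

```typescript
// Hypothetical sketch of the described architecture: capture structured
// correction events, then periodically fold them into a confidence-weighted
// preference profile per user/team/org. Names are illustrative only.
interface CorrectionEvent {
  userId: string;
  field: string;          // e.g. "similarity_threshold"
  agentValue: unknown;    // what the agent produced
  userValue?: unknown;    // what the user changed it to (absent if accepted as-is)
  timestamp: number;
}

interface Preference {
  field: string;
  value: unknown;
  confidence: number;     // grows when repeated corrections agree
}

// Background job: group corrections by field, keep the most recent user value,
// and weight confidence by how consistently the user made the same correction.
function buildProfile(events: CorrectionEvent[]): Preference[] {
  const byField = new Map<string, CorrectionEvent[]>();
  for (const e of events) {
    if (e.userValue === undefined) continue; // accepted output is not a correction
    const bucket = byField.get(e.field) ?? [];
    bucket.push(e);
    byField.set(e.field, bucket);
  }
  return [...byField.entries()].map(([field, evs]) => {
    const latest = evs.reduce((a, b) => (a.timestamp > b.timestamp ? a : b));
    const agreeing = evs.filter((e) => e.userValue === latest.userValue).length;
    return { field, value: latest.userValue, confidence: agreeing / evs.length };
  });
}

// Before each session the agent would fetch this profile and pre-apply
// high-confidence preferences, e.g. defaulting a threshold to 80 instead of 85.
```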
New Show Hacker News story: Show HN: I built an AI twin recruiters can interview
Show HN: I built an AI twin recruiters can interview
2 by Charlie112 | 3 comments on Hacker News.
https://chengai.me

The problem: Hiring new grads is broken. Thousands of identical resumes, but we're all different people. Understanding someone takes time - assessments, phone screens, multiple interviews. Most never get truly seen. I didn't want to be just another PDF. So I built an AI twin that recruiters can actually interview.

What you can do:
• Interview my AI about anything: https://chengai.me/chat
• Paste your JD to see if we match: https://ift.tt/XdNZ4Gq
• Explore my projects, code, and writing

What happened: Sent it to one recruiter on LinkedIn. Next day, traffic spiked as it spread internally. Got interview invites within 24 hours.

The bigger vision: What if this became standard? Instead of resume spam → keyword screening → interview rounds that still miss good fits, let recruiter AI talk to candidate AI for deep discovery. Build a platform where anyone can create their AI twin for genuine matching.

I'm seeking Software/AI/ML Engineering roles and can build production-ready solutions from scratch. The site itself proves what I can do. Would love HN's thoughts on both the execution and the vision.