Hacker News
CMind: An AI Agent for Localizing C Memory Bugs (arxiv.org)
1 point by PaulHoule 29 minutes ago | past | discuss
A benchmark for vericoding: formally verified program synthesis (arxiv.org)
1 point by luskira 54 minutes ago | past | discuss
Computing Diffusion Geometry (arxiv.org)
2 points by aanet 1 hour ago | past | 1 comment
Constructing Unlearnable Data with Solely Linear Classifiers (arxiv.org)
1 point by PaulHoule 2 hours ago | past | discuss
Experiential Reinforcement Learning (arxiv.org)
3 points by geophile 4 hours ago | past | discuss
Identity, Cooperation and Framing Within Groups of Real and Simulated Humans (arxiv.org)
2 points by PaulHoule 5 hours ago | past | discuss
Investigating the Downstream Effect of AI Assistants on Software Maintainability (arxiv.org)
2 points by KallDrexx 5 hours ago | past | 2 comments
A New Perspective on Drawing Venn Diagrams for Data Visualization (arxiv.org)
2 points by IdealeZahlen 5 hours ago | past | discuss
Language Models Entangle Language and Culture (arxiv.org)
1 point by paraschopra 5 hours ago | past | discuss
Does Socialization Emerge in AI Agent Society? A Case Study of Moltbook (arxiv.org)
1 point by simonpure 6 hours ago | past | 1 comment
Biases in the Blind Spot: Detecting What LLMs Fail to Mention (arxiv.org)
2 points by azalemeth 12 hours ago | past | discuss
Prompt Repetition Improves Non Reasoning LLM (arxiv.org)
1 point by jdthedisciple 12 hours ago | past | discuss
GLM-5 Technical Report (arxiv.org)
9 points by meetpateltech 16 hours ago | past | discuss
Training-Free Group Relative Policy Optimization (arxiv.org)
1 point by readitalready 23 hours ago | past | discuss
Composition-RL: Compose Verifiable Prompts for Reinforcement Learning of LLMs (arxiv.org)
3 points by gmays 1 day ago | past | discuss
Reducing the cost of breaking RSA-2048 to 100000 physical qubits (arxiv.org)
3 points by fuglede_ 1 day ago | past | discuss
Intelligent AI Delegation (arxiv.org)
2 points by gmays 1 day ago | past | discuss
Randomness in Agentic Evals (arxiv.org)
1 point by andre15silva 1 day ago | past | discuss
Hunt Globally (arxiv.org)
1 point by salkahfi 1 day ago | past | discuss
Frontier Models Exhibit Sophisticated Reasoning in Simulated Nuclear Crises (arxiv.org)
1 point by salkahfi 1 day ago | past | discuss
Learning State-Tracking from Code Using Linear RNNs (arxiv.org)
2 points by jul8234 1 day ago | past | 1 comment
A Survey of In-Context Reinforcement Learning (arxiv.org)
2 points by handfuloflight 1 day ago | past | discuss
Soft Contamination Means Benchmarks Test Shallow Generalization (arxiv.org)
2 points by cjbarber 1 day ago | past | 1 comment
SkillsBench: Benchmarking how well agent skills work across diverse tasks (arxiv.org)
358 points by mustaphah 1 day ago | past | 162 comments
Virtual Width Networks (VWN) (arxiv.org)
8 points by tesserato 1 day ago | past | 1 comment
CodeLogician: Neuro-symbolic reasoning for precise software analysis (arxiv.org)
2 points by NTCTech 2 days ago | past | 1 comment
Intelligent AI Delegation (2026) (arxiv.org)
1 point by Nydhal 2 days ago | past | discuss
Delegated Agent Authorization Constrained to Semantic Task-to-Scope Matching (arxiv.org)
1 point by mooreds 2 days ago | past | discuss
Evaluating AGENTS.md: are they helpful for coding agents? (arxiv.org)
199 points by mustaphah 2 days ago | past | 155 comments
Multi-Agent Teams Hold Experts Back (arxiv.org)
1 point by fauigerzigerk 2 days ago | past | discuss