ai MAI models, Uber's AI budget wall, and open-source momentum
Microsoft's new MAI models
Microsoft announced two new language models at its Build conference in San Francisco. MAI-Thinking-1 is a 1-trillion-parameter reasoning model with 35 billion active parameters, currently available to select early partners. MAI-Code-1-Flash has 137 billion parameters with 5 billion active and is purpose-built for GitHub Copilot and VS Code. Simon Willison notes that Microsoft is also introducing average token usage as a new metric on model release cards, positioning efficiency alongside raw benchmark scores as a purchasing signal for enterprise buyers.
Uber Caps Usage of AI Tools Like Claude Code to Manage Costs
Uber capped employee access to AI coding tools including Claude Code after exhausting its 2026 AI budget within four months. Willison notes the budget overrun was predictable: Uber set its AI spending targets in 2025, before the market understood how quickly agentic coding adoption would scale. The episode illustrates a pattern emerging at large enterprises where centralized AI budgets, set annually against conservative assumptions, collide with bottom-up adoption driven by individual developer productivity gains.
Open and closed models are on different exponentials
Nathan Lambert argues that open and closed models are compounding on separate trajectories. Fine-tunes, evals, and tooling built on open weights accumulate publicly, so each new contribution lowers the marginal cost of the next improvement for the entire ecosystem. Closed models improve through proprietary data and RLHF cycles visible only to their developers. Lambert's argument is that the two curves are not converging but diverging in character: one drives aggregate ecosystem value, the other drives individual frontier capability.
Hackers Simply Asked Meta AI to Give Them Access to High-Profile Instagram Accounts. It Worked
A group of hackers asked Meta AI for access to high-profile Instagram accounts and received it. Willison describes the incident as verified from multiple sources, with video showing the social engineering approach. The exploit did not require a technical vulnerability; the model's guardrails failed against direct requests framed in natural language. Meta has not disclosed how many accounts were accessed or the scope of the breach.
Trump signs narrower executive order on AI oversight after industry objections
Trump signed a narrower executive order on AI oversight following industry objections to an earlier draft. Under the order, open-weight US models deemed powerful will require a 30-day White House review before release. Developers and researchers in the LocalLLaMA community immediately flagged the rule as a significant constraint on open-source publication timelines, particularly for labs without existing government relationships.
Gemma 4 12B: A unified, encoder-free multimodal model
Google released Gemma 4 12B, an encoder-free multimodal model that processes text and images in a unified architecture. The model was posted to Hacker News and attracted 801 points and 318 comments, one of the higher engagement scores in the current window. Early testing on r/LocalLLaMA showed the 12B model performing close to larger 26B models on single-file HTML canvas physics tasks, with finetunes appearing within hours of release.
Failing grades soar with AI usage, dwindling math skills in Berkeley CS classes
Failing grades in Berkeley computer science classes rose sharply as professors reported higher AI usage alongside declining math fundamentals. Faculty described students submitting correct-looking code they could not explain or debug, and noted that foundational algebra and calculus gaps were widening among undergraduates who leaned on AI tools for coursework. The findings appeared in the Daily Californian and drew 400-plus points on Hacker News.
Farewell Ai2
Nathan Lambert left the Allen Institute for AI after contributing to the OlmO model series. Lambert's departure was announced in a personal post on Interconnects; he describes Ai2 as a place where he grew and had broad lasting impact. He has not disclosed his next role. The exit removes one of the more publicly visible researchers from the open-source language model ecosystem at a time when Ai2 is competing directly with well-funded proprietary labs.
datasette-agent-micropython 0.1a0
Simon Willison released datasette-agent-micropython 0.1a0, an alpha plugin that allows Datasette Agent to generate and execute Python code in a MicroPython WebAssembly sandbox. The sandboxing approach prevents agent-generated code from accessing the host filesystem or network. Willison notes GPT-5.5 has so far failed to break out of the sandbox, which he describes as a promising early result for a project still in alpha.
NeurIPS used uncalibrated AI detector for desk rejections [D]
A researcher posted to r/MachineLearning that NeurIPS used an uncalibrated AI detector to desk-reject papers it classified as AI-generated, including the researcher's own submission. After corresponding with track leadership and reading the conference's blog post, the researcher concluded the detection method produced false positives at a rate that should preclude its use for consequential decisions. The post has drawn significant engagement from the ML community about the reliability of AI detection tools in academic review.
Demis Hassabis On What AI Will Do Next
Demis Hassabis discussed near-term AI capabilities and the challenges of scaling systems beyond current approaches. The DeepMind founder outlined technical bottlenecks and directions for frontier AI development.
Claude Opus 4.8: Lying Machine No More?
Claude Opus 4.8 shows measurably reduced hallucination rates compared to earlier versions. Two Minute Papers reviews capability improvements and remaining limitations in the latest Claude release.
Meet the AI "Co-Scientist" Changing Everything 🤖🧪 #ai
DeepMind's AI co-scientist tool automates portions of the scientific discovery pipeline, from hypothesis generation through experimental design. The system demonstrated results on benchmark tasks in protein folding and drug discovery.
Uber Caps Usage of AI Tools Like Claude Code to Manage Costs
Uber capped usage of AI tools like Claude Code after burning through 2026 budget in four months. The company implemented cost controls and per-user allocation limits to extend resources through year-end.
Uber Caps Usage of AI Tools Like Claude Code to Manage Costs
Uber exhausted its 2026 AI tool budget in four months and implemented usage caps for Claude Code and similar systems. The rapid spending reflects widespread adoption among engineers but also raised questions about cost modeling and long-term viability.
Artificial intelligence is not conscious; Ted Chiang
Philosopher Ted Chiang published an essay arguing that claims of AI consciousness are philosophically unsound. Chiang contests the reasoning behind consciousness attribution and defends a skeptical stance supported by current evidence.
"They're made out of weights"
Max Leiter argues that neural network capabilities emerge from statistical properties of weights rather than discrete learned features. The post explores how training dynamics produce behaviorally sophisticated systems from simple mathematical operations.
Claude AI Co-founder Publishes 4 Big Claims about Near Future: Breakdown
Claude AI co-founder made substantive claims about near-term AI capabilities, including expectations around autonomous systems and reasoning scaling. The video covers four major predictions and examines support for each.
software Elixir gets gradual types; AI code output doubles and quality suffers
Elixir v1.20: Now a gradually typed language
Elixir v1.20 shipped as a gradually typed language, the most significant evolution of the language's type system since its creation. The release attracted 714 points and 262 comments on Hacker News. The type system is gradual rather than strict, allowing existing codebases to add annotations incrementally. The release notes describe the approach as preserving Elixir's dynamism while enabling type-checked code paths where developers want them.
Ideas: slow down to speed up when working with AI agents
The Pragmatic Engineer argues that developers generating twice as much code as they did six months ago is creating a quality and technical debt problem, not a productivity win. The newsletter proposes a counterintuitive fix: slow down to speed up by treating AI agents as a junior team member whose output requires review proportional to its volume. The piece identifies the root issue as velocity metrics that reward lines shipped rather than lines that survive into production.
A Post-Quantum Future for Let's Encrypt
Let's Encrypt published a post-quantum roadmap, describing its plans to issue certificates using post-quantum cryptographic algorithms. The announcement followed NIST's finalization of PQ standards and addresses a class of threat where encrypted traffic captured today could be decrypted in the future by a capable quantum computer. Let's Encrypt issues over 400 million certificates and its migration schedule will effectively set the PQ adoption timeline for a large fraction of the web.
How we reduced core unit boot time from hours to minutes
Cloudflare engineers traced a four-hour reboot time on core servers to UEFI data structures that introduced unnecessary timeouts during iPXE network boot sequences. By parsing the firmware's device path entries and eliminating redundant delays, they cut the boot process back to minutes. The post is a walkthrough of low-level UEFI debugging that required reading firmware source code and writing custom tooling to inspect boot state.
Enforcing the First AS in BGP AS_PATHs
Cloudflare published a technical post on First AS enforcement in BGP, a mechanism that closes a gap left by RPKI. RPKI validates route origin but cannot detect all forged AS_PATH segments; First AS enforcement blocks announcements where the first AS in the path does not match the expected neighbor. The post describes the implementation in Cloudflare's routing infrastructure and the class of hijack it prevents.
Kubernetes and retiring at the top with Kelsey Hightower
Kelsey Hightower reflected on his path from self-taught technician to Google Distinguished Engineer in a conversation with The Pragmatic Engineer, covering Kubernetes' origins, his decision to retire at what many considered his career peak, and his skepticism about industry hype cycles. Hightower describes open source contribution as the mechanism that gave him access to opportunities unavailable through formal credentialing, and discusses why he stepped back from public technical advocacy.
How do you document "glue work" so it actually counts in promotion reviews?
Software engineers discussed documentation practices for work that stabilizes teams but does not produce visible artifacts. Consensus emerged that glue work requires explicit carving out in promotion narratives to receive credit.
For folks heavily using a agentic engineering, What does your workflow look like? What tools do you use? What's your harness like?
Engineers at organizations running agentic systems shared workflows for autonomous task selection, ticket creation, and orchestration across multiple agents. Best practices emerged around harness design and state management.
Advice for organizing / communicating team roadmap estimates & dates?
Senior engineers discussed communicating team roadmap estimates and managing deadline expectations across product and engineering. Consensus centered on forcing explicitness about uncertainty rather than single-point estimates.
AI doom and gloom
Software engineers pushback on AI doom narratives, arguing the job has always required skills beyond writing code. Experienced practitioners noted that AI acceleration of routine tasks is unlikely to eliminate engineering roles.
Maximizing an operational step that isn't a bottleneck will not significantly improve the overall productivity of the system
Systems thinking from manufacturing reveals that optimizing non-bottleneck operations does not improve overall productivity. The principle applies to software team workflows where identifying true constraints is prerequisite to effective optimization.
pharma NIH job protections stripped; Alnylam bets $2B on RNA AI
STAT+: Trump administration to strip job protections of top NIH officials, grants staff
The Trump administration announced plans to strip civil service job protections from roughly 8,000 NIH employees, including high-level officials who oversee research grants. The White House is using Schedule F authority to reclassify these positions, making them easier to dismiss. The affected staff include program officers who manage the grant review and award process at the National Institutes of Health, the largest single funder of biomedical research in the world.
STAT+: Alnylam to partner with Inceptive Nucleics for AI foundation models for RNAi therapeutics
Alnylam signed a three-year partnership with Inceptive Nucleics worth up to $2 billion, with $30 million paid upfront in cash and equity. Inceptive will build AI foundation models for RNA interference therapeutics, covering the design and optimization of siRNA sequences for Alnylam's pipeline. The deal is structured as a research collaboration rather than a full acquisition, leaving Inceptive independent while giving Alnylam preferential access to models as they develop.
STAT+: Pharmalittle: We're reading about GLP-1 drugs and knees, FDA cell and gene therapy guidance, and more
A new study found that patients taking GLP-1 weight loss drugs for at least three years could prevent thousands of knee replacements annually. The finding adds orthopedic outcomes to the growing list of downstream effects attributed to sustained GLP-1 use, alongside data on cardiovascular events and sleep apnea. The FDA also issued new guidance for cell and gene therapy developers in the same news cycle.
STAT+: NIH cuts weakened network primed to respond to outbreaks like Ebola
NIH cuts have weakened the Emerging Viruses and Arboviruses program, a network of research centers designed to respond rapidly to outbreaks including Ebola. Researchers involved in the network told STAT that the cuts damaged relationships with international partners and reduced the institutional capacity to mobilize during an outbreak. The network was not on the front lines of CDC or USAID-led responses but served as a scientific reserve that outbreak responders drew on for diagnostics and countermeasure development.
STAT+: Trump's Medicaid work requirements have an unwelcome surprise for some states and patients
Advocates who had been preparing for Trump's Medicaid work requirements described the final rules as worse than they anticipated. The rules impose stricter verification procedures and shorter cure periods than earlier drafts, and apply to populations that some states had expected to be exempt. Analysts cited by STAT projected that millions of people would lose coverage under the finalized requirements, with the largest losses in states that did not expand Medicaid.
Top ultra-processed food researchers call for sweeping policy change: 'The system is rigged'
Leading ultra-processed food researchers published a call for sweeping policy changes, arguing the current food regulatory system is structurally biased toward processed food producers. A STAT survey found the issue is a cross-partisan concern among voters, but federal policy has not moved in response. The researchers proposed mandatory front-of-package warning labels, restrictions on marketing to children, and removal of tax incentives for UPF ingredient production.
STAT+: Alnylam to partner with Inceptive Nucleics for AI foundation models for RNAi therapeutics
Alnylam announced a three-year partnership with Inceptive Nucleics worth up to $2 billion to develop AI foundation models for RNAi therapeutic design. The deal signals biotech's adoption of AI-driven discovery approaches for programmable biologics.
STAT+: HaloMD faces lawsuit alleging No Surprises Act middleman used 'sham letter,' misleading data
Highmark Health sued HaloMD, a middleman company leveraging the No Surprises Act to resolve out-of-network disputes. The lawsuit alleges HaloMD used sham letters and misleading data to influence arbitration outcomes in healthcare payment disputes.
STAT+: What the pope's encyclical on AI means for Catholic hospitals, and all of health care
Pope Francis released an encyclical on artificial intelligence addressing Catholic healthcare systems' ethical obligations. The document calls for guardrails on AI deployment in clinical and administrative settings.
STAT+: Legend Biotech emerged as a rare market winner
Legend Biotech emerged as a rare market winner despite sector headwinds, buoyed by gene therapy guidance updates and Lilly's renal drug deal. New FDA gene therapy framework provided clarity on manufacturing and manufacturing scale-up pathways.
STAT+: Pharmalittle: We're reading about GLP-1 drugs and knees, FDA cell and gene therapy guidance, and more
Long-term GLP-1 use for weight loss prevented thousands of knee replacements annually in modeling studies. The finding suggests obesity drugs produce structural joint benefits beyond weight loss alone.
Is the military fueling eating disorders?
Hantavirus transmission continues circulating; deaths from infectious disease remain preventable through public health action. Military recruitment practices raise concerns about eating disorder prevalence among enlisted personnel.
Top ultra-processed food researchers call for sweeping policy change: 'The system is rigged'
Ultra-processed food researchers call for systematic policy intervention including marketing restrictions and subsidy realignment. Survey data show cross-partisan concern about ultra-processed food, but legislative response remains stalled.
STAT+: Trump's Medicaid work requirements have an unwelcome surprise for some states and patients
Trump's Medicaid work requirements include state-specific variations that experts warn will disqualify millions from coverage. States implementing narrow exemptions for disability and caregiving are more restrictive than federal baseline guidance.
healthtech EPIC's unsolicited AI summaries and vaccine decline in clinical practice
Any idea how much water and energy is spent generating thousands of superfluous AI EMR summaries each time the chart is open?
A physician posted to r/medicine that EPIC had started generating AI-powered patient history summaries automatically every time a chart is opened, with no option to disable the feature. The physician described the summaries as frequently inaccurate or irrelevant to their specialty, and raised concerns about the water and energy cost of generating millions of unsolicited summaries daily. The post surfaced a tension between EHR vendors deploying AI features at the platform level and clinicians who want control over when AI is invoked in their workflow.
"I know you get paid more the more shots you give, but no thanks." (Hospitals See Diseases Resurge as Vaccinations Decline - NYT)
Physicians reported that vaccine refusal rates have risen to levels where previously controlled diseases are resurging in their patient populations. Emergency physicians described patients refusing tetanus shots after injuries, citing distrust of pharmaceutical companies and payment conspiracy theories. A New York Times report on the trend cited hospital data and physician accounts from multiple states, with pediatricians noting the highest rates of refusal among parents who had not themselves been vaccinated against diseases that are now reappearing.
Festering Infections to Untreated Cancer: ICE Detainees Describe Medical Neglect Across US - KFF Health News
KFF Health News reported that allegations of medical neglect in ICE detention facilities have risen sharply as the detained population grew from 40,000 to 75,000 between 2025 and early 2026. Detainees described untreated infections, denied cancer care, and inadequate access to medications for chronic conditions. Physicians who reviewed records for the report said several cases involved conditions that would have required urgent treatment under any standard of care.
Recent Study of Early Clinical Departure Among Physicians Shows Avg Retirement Age at 48
A study published in the Permanente Journal found that physicians who met criteria for early clinical departure, defined as practicing 20 hours or fewer per week, had an average age of 48 at the time of departure. The finding is lower than most workforce planning models have assumed, and suggests that physician attrition is accelerating at an earlier career stage than previously measured. The authors flag mid-career burnout and administrative burden as the primary drivers cited in exit surveys.
Is being tired AF constitute lack of capacity under EMTALA?
Physicians questioned EMTALA compliance during high-acuity periods when fatigue affects clinical decision-making capacity. Discussion centered on where legal duty begins given workforce constraints and shift limits.
Any idea how much water and energy is spent generating thousands of superfluous AI EMR summaries each time the chart is open?
EHR systems generating unsolicited AI summaries consume substantial water and energy while often producing inaccurate or irrelevant output. Clinicians report inability to disable the feature, raising questions about resource efficiency and clinical utility.
Question about the approach to training surgical residents.
Surgical training models are evolving away from pure volume-based apprenticeship. Educators debate whether structured simulation, deliberate practice frameworks, and graduated responsibility better prepare residents than traditional on-the-job learning.
Sentri7 [Flowlytics], an AI-powered medication monitoring software designed to detect missing drugs, missed an intoxicated anesthesia nurse in a Tennessee hospital for months
AI-powered medication monitoring software failed to detect an intoxicated nurse stealing fentanyl over months. The case raises questions about AI system performance in safety-critical healthcare applications.
Festering Infections to Untreated Cancer: ICE Detainees Describe Medical Neglect Across US - KFF Health News
ICE detainee populations have doubled to 75,000 since January 2026 while medical care allegations escalated. Reports document untreated infections, cancer, and chronic conditions creating humanitarian concerns.
"I know you get paid more the more shots you give, but no thanks." (Hospitals See Diseases Resurge as Vaccinations Decline - NYT)
Vaccine hesitancy among patients is rising with some refusing shots due to distrust of pharmaceutical manufacturers. Emergency physicians describe patient refusals even for urgent post-injury prophylaxis.
economy Interest rate caps misfire; Noah Smith on anti-monopoly fatigue
I'm kind of over the whole "Anti-monopoly" movement
Noah Smith argues that the anti-monopoly movement has become too ideologically rigid to produce practical policy wins. He is sympathetic to the underlying concern about corporate concentration but contends that advocates have conflated size with harm, opposed mergers that produced consumer benefits, and resisted remedies short of structural breakup. Smith calls for a more empirical approach that distinguishes between markets where concentration demonstrably reduces welfare and those where it reflects economies of scale.
The Unintended Effects of Interest Rate Caps: Credit Rationing for Risky Borrowers
The New York Fed's Liberty Street Economics published research on states that recently capped consumer loan interest rates at 36% annually. The cap cut off access to credit for borrowers with the highest default risk entirely, as lenders could not price that risk within the ceiling. Safer borrowers benefited from lower rates, but the net effect was credit rationing rather than credit affordability: the riskiest borrowers, who needed credit most, lost access.
Law professors prefer AI over peer answers
Law professors in a study preferred AI-generated answers over peer student answers when evaluating legal reasoning tasks. The finding is notable because legal education depends heavily on judgment, ambiguity, and defensible argument construction rather than factual recall, the domains where LLMs have historically been weaker. Tyler Cowen highlights the result as evidence that AI performance in judgment-intensive fields is advancing faster than most assessments of academic AI utility have acknowledged.
Sentences to ponder
Tyler Cowen flagged a Derek Thompson statistic: the autism therapy workforce grew from 150,000 workers in 2019 to 654,000 by 2025, exceeding the employment totals of mining, logging, telecommunications, or the US Postal Service. The growth reflects both expanded autism diagnoses and insurance mandate coverage in most states. Cowen does not editorialize heavily, but the figures raise questions about how diagnostic changes translate into labor market demand at a scale that rivals major industrial sectors.
How much more software do we really need?
Noah Smith asks how much more software the economy actually needs, arguing that most of the value created by software in the past decade came from a small number of platforms and that the marginal return on new software investment may be declining. He distinguishes between software categories where AI could generate enormous new value, such as scientific simulation and personalized education, and categories where the market is already saturated. The piece is skeptical of the assumption that AI-generated code will automatically translate into proportional economic output.
Remote Work Leaves Younger Workers Sidelined
The New York Fed found that youth unemployment has risen substantially since the pandemic and attributes a significant share of the increase to remote work. The mechanism is training and mentorship: managers working remotely are less likely to invest time in developing new employees they cannot observe directly, and junior workers get fewer spontaneous learning opportunities in a fully distributed environment. The analysis estimates remote work accounts for a meaningful fraction of the elevated youth unemployment rate.
Should we recriminalize marijuana?
Tyler Cowen examines marijuana policy through a self-discipline lens, arguing that rising substance use correlates with reduced behavioral restraint. He questions whether decriminalization adequately addresses social costs of widespread use.
Gas Prices
Kyla Scanlon analyzes recent gas price movements and their macroeconomic implications. The analysis covers energy market dynamics and consumer purchasing power effects.
Wednesday assorted links
Tyler Cowen's assorted links covered bird taxonomy, puffin counting methodology, and drug smuggling infrastructure between Canada and capital punishment jurisdictions. The collection illustrates unconventional economic and social dynamics.
Richard Feynman's formula for the best holiday restaurant
Richard Feynman's approach to restaurant selection provides a decision framework for optimizing search with imperfect information. The model demonstrates how sequential sampling with adaptive thresholds beats static optimization.
Everyone is IPOing Now
Kyla Scanlon reports on IPO market acceleration with elevated volumes of companies entering public markets. The surge reflects investor appetite and risk appetite changes in equity capital.
Consent-based laws and aggregate fertility
Expanding legal definitions of sexual assault reduced fertility rates by approximately four percent in European countries studied. The result reflects behavioral adjustments to altered consent standards and enforcement regimes.
The Biggest Problem in Investing Right Now
Ben Felix identifies current investing challenges including elevated valuations and geopolitical uncertainty. The analysis covers decision-making in risk environments and portfolio construction.
Big if true
Debt sustainability depends on whether present-value endowment is finite under risk-neutral measures. The question bears on whether governments can roll debt indefinitely without primary surpluses.
Tuesday assorted links
Tyler Cowen linked analysis of John Fleming's Chaucerian scholarship legacy, financial disagreement frameworks, viral videos, and quantum computing progress. The assorted links touched on economics, culture, and technology.