AI in Battery Manufacturing Was Never a Switch

AI in battery manufacturing is not one thing you switch on, and most of it is not new. Strip the marketing and the bulk of it is machine learning that has run on production lines for over a decade. What actually changed is the reasoning layer: a brain that sits beside the factory, not on top of it.

AI in battery manufacturing is not a single capability you either have or you don't. Most of what gets called AI is machine learning and predictive modeling, and that work is decades old. What is genuinely recent is generative AI, and newer still are agents: language models given tools, memory, and domain skills. The architecture has five pillars, and a reasoning brain reaches into four of them. It never sits on top.

Why people think AI is a switch you flip

Ask most teams whether they "do AI" in their cell plant and you get a yes or a no. That framing is the first mistake. AI in manufacturing is not one capability that arrives in a box. It is a set of layers that have grown at very different speeds, some of them for thirty years.

The second mistake is the picture people draw. Sensors at the bottom, data above them, dashboards above that, and AI as a thin band painted across the top. That pyramid is wrong, and it is wrong in a way that costs money. Scrap rates of 15 to 30 percent are common in the first years of cell production, and even after five years reject rates sit near 10 percent, with each point of scrap costing a gigafactory on the order of 10 million euros a year^[1]. A layer that only sits on top of that problem cannot reach down far enough to fix it.

Figure: The common pyramid model of AI in battery manufacturing, and the mistake in it. Sensors sit at the wide base, then data, then dashboards, with AI as a thin band at the narrow top. This is the mental model this article argues against: AI painted as a thin band on top cannot reach down into the layers where scrap is actually made.

"AI" is mostly a name for machine learning, and machine learning is not new

Strip the marketing language off most battery AI and what is left is machine learning and predictive modeling. That is not a criticism. It is accurate, and it matters because it tells you what is actually new and what is not.

Predictive modeling in batteries goes back decades. The pseudo-two-dimensional model, or P2D, from Doyle, Fuller, and Newman describes the charge and discharge of a lithium cell from concentrated solution theory, and it remains the backbone of electrochemical simulation more than thirty years after publication^[2]. The data-driven side matured years ago too. Severson and colleagues showed that discharge voltage curves from the first 100 cycles, before any capacity fade is visible, predict cycle life with 9.1 percent test error, and that the first 5 cycles alone sort cells into low- and high-lifetime groups with 4.9 percent error, across 124 commercial LFP cells whose lives ran from 150 to 2,300 cycles^[3].
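
To make the early-prediction idea concrete, here is a minimal sketch on synthetic data. The simplest model in Severson and colleagues' paper rests on one feature, the variance of the difference between the discharge capacity curves Q(V) of cycle 100 and cycle 10. The curve shapes, the cell count, the noise level, and the toy relationship between lifetime and curve shift below are all assumptions made for illustration; the sketch does not reproduce the published dataset or its error figures.

```python
import numpy as np

def delta_q_variance_feature(q_cycle_10, q_cycle_100):
    """log10 of the variance of Q_100(V) - Q_10(V) on a shared voltage grid."""
    delta_q = np.asarray(q_cycle_100) - np.asarray(q_cycle_10)
    return np.log10(np.var(delta_q))

# Synthetic stand-in data (an assumption, not the published dataset).
rng = np.random.default_rng(0)
voltage = np.linspace(2.0, 3.5, 200)              # shared voltage grid, volts
n_cells = 40
log_cycle_life = rng.uniform(np.log10(150), np.log10(2300), n_cells)

features = []
for log_life in log_cycle_life:
    q10 = 1.1 * (3.5 - voltage) / 1.5              # idealized discharge curve, Ah
    # Toy assumption: shorter-lived cells shift more between cycle 10 and 100.
    shift = 10 ** (-0.5 * log_life) * np.exp(-((voltage - 3.0) ** 2) / 0.02)
    noise = rng.normal(0.0, 1e-4, voltage.size)
    features.append(delta_q_variance_feature(q10, q10 - shift + noise))
features = np.array(features)

# Linear fit of log10(cycle life) against the single early-cycle feature.
slope, intercept = np.polyfit(features, log_cycle_life, 1)
predicted_life = 10 ** (slope * features + intercept)
```

The fitted slope comes out negative by construction here: a larger early shift in the curve goes with a shorter life, which is the direction of the relationship the paper reports.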

So a plant does not become AI-empowered the moment it runs a machine learning model. The companies at the front of this field have run these models quietly for years. The vocabulary changed faster than the substance did.

What is actually new: generative AI, and then agents

The genuinely recent shift is generative AI applied to battery and materials work, and it is only a few years old. Large language models can read a deviation report, a process spec, and a batch record and reason across them in language, which the predictive models above could never do.

Newer still, and more important, are agents. An agent is a language model given a set of appendages that turn a passive chatbot into something that acts. Tools, so it can call a model, query a database, or run code. Knowledge, so it carries domain context. Skills, so it follows the procedures an experienced engineer would. Memory, so it remembers what it found on the last run. Those appendages are the difference between a model that answers a question and a system that runs an investigation.
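
The four appendages can be sketched as a small data structure. This is an illustrative outline only: `Agent`, `stub_model`, the tool names, and the canned tool outputs are hypothetical, and the language model is replaced by a keyword router so the example runs without any external service.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """A language model wrapped with tools, knowledge, skills, and memory."""
    model: Callable[[str], str]                                  # prompt -> tool name
    tools: Dict[str, Callable[[str], str]]                       # things it can call
    knowledge: Dict[str, str] = field(default_factory=dict)      # domain context
    skills: Dict[str, List[str]] = field(default_factory=dict)   # named procedures
    memory: List[str] = field(default_factory=list)              # earlier findings

    def investigate(self, question: str, skill: str) -> List[str]:
        findings = []
        for step in self.skills[skill]:
            prompt = f"{self.knowledge.get('context', '')} | {question} | step: {step}"
            tool_name = self.model(prompt)             # the model picks a tool
            result = self.tools[tool_name](question)   # the tool does the work
            findings.append(f"{step}: {result}")
        self.memory.extend(findings)                   # kept for the next run
        return findings


def stub_model(prompt: str) -> str:
    # Stand-in for a real language model: routes on a keyword in the step text.
    step_text = prompt.rsplit("step:", 1)[-1]
    return "query_db" if "query" in step_text else "run_model"


agent = Agent(
    model=stub_model,
    tools={
        "query_db": lambda q: "3 lots flagged",            # canned example output
        "run_model": lambda q: "viscosity drift likely",   # canned example output
    },
    knowledge={"context": "cathode coating line"},
    skills={"rca": ["query recent lots", "score likely causes"]},
)
findings = agent.investigate("Why did the defect rate climb?", skill="rca")
```

Remove the skills and the memory and what remains is a model that answers one prompt; with them, the same model walks a procedure and keeps what it found.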

You have run the machine-learning part for years. The new question is whether you build the reasoning and agentic layer that sits beside everything else and puts the rest to work.

What the AI architecture for battery manufacturing actually looks like

The right picture is not a pyramid. It is five pillars. Four of them are layers the reasoning brain reaches into, each with a defined access mode, and the fifth is the brain itself. Two layers the brain may only read. Two it reads and writes. The order below follows how deep its responsibility runs, from the layer it merely observes to the layer it actively builds.

Figure: The architecture this article argues for: four layers and one reasoning brain beside the factory. Four stacked layers on the left (predictive and simulation models, data foundation, systems of record, physical and control) connect to a reasoning and agentic brain on the right. The brain reads and writes the top two layers, the models and the data foundation, and only reads the bottom two, the systems of record and the physical floor.
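
The access modes can be written down as a small policy table. This is a sketch under assumptions: the layer keys, `Access`, `is_allowed`, and `authorize` are hypothetical names chosen for illustration, not part of any product or standard.

```python
from enum import Flag, auto


class Access(Flag):
    READ = auto()
    WRITE = auto()


# One entry per layer the brain reaches into, in the order used in this article.
POLICY = {
    "physical_and_control": Access.READ,
    "systems_of_record": Access.READ,
    "data_foundation": Access.READ | Access.WRITE,
    "predictive_and_simulation": Access.READ | Access.WRITE,
}


def is_allowed(layer: str, requested: Access) -> bool:
    """True only if every requested permission is granted for that layer."""
    return (requested & POLICY[layer]) == requested


def authorize(layer: str, requested: Access) -> None:
    """Raise before any call that would cross the access boundary."""
    if not is_allowed(layer, requested):
        raise PermissionError(f"{requested} is not permitted on {layer}")
```

Putting the check in one place means a new tool or connector cannot quietly gain write access to the floor or the ledger; it has to change the table, where the change is visible.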

Pillar one: the physical and control layer, which the brain observes but never touches

This is the deterministic, hard real-time, safety-critical layer. PLCs, SCADA, sensors, machine vision cameras, and equipment controllers. A coater's web-tension loop or a welder's interlock has to respond inside a bounded time, every time, or product and people are at risk. The brain ingests telemetry and equipment state from this layer. It does not actuate.

Read-only is the correct mode here, and not as a temporary concession. A probabilistic, variable-latency model cannot guarantee a response inside a fixed window, so actuation stays with certified deterministic control. Even reading this layer well is real work: OT network integration, protocol gateways for OPC UA and Modbus, edge connectivity, and data engineering that handles high-frequency signals without aliasing them. The data volume is enormous. Machine vision already inspects the full electrode web continuously while a human inspector spot-checks roughly 5 percent of it, so 100 percent coverage exists at the sensor, if you can pull the data out cleanly.
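
The aliasing point is easy to show with numbers. The sketch below uses an invented signal: a slow 2 Hz process variation plus a 430 Hz vibration, sampled at an assumed 1 kHz and reduced to 100 Hz. Block averaging stands in for a proper anti-aliasing filter; a production pipeline would more likely use a designed low-pass filter, for example through `scipy.signal.decimate`.

```python
import numpy as np

fs_raw = 1000                     # assumed raw sensor rate, Hz
factor = 10                       # keep one value per 10 samples -> 100 Hz
t = np.arange(2000) / fs_raw      # two seconds of samples

slow = np.sin(2 * np.pi * 2 * t)            # the 2 Hz variation we care about
fast = 0.5 * np.sin(2 * np.pi * 430 * t)    # 430 Hz, above the new 50 Hz Nyquist
signal = slow + fast

# Naive downsampling: the 430 Hz tone folds down to 30 Hz and looks real.
naive = signal[::factor]

# Crude anti-aliasing: average each block of 10 samples, then keep one value.
averaged = signal.reshape(-1, factor).mean(axis=1)

# Compare each result with the slow component treated the same way.
err_naive = np.sqrt(np.mean((naive - slow[::factor]) ** 2))
err_avg = np.sqrt(np.mean((averaged - slow.reshape(-1, factor).mean(axis=1)) ** 2))
```

In this toy case the naive stream carries a phantom 30 Hz oscillation at nearly full strength, while the averaged stream suppresses most of it. A model trained on the naive stream would learn a pattern that does not exist on the line.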

Pillar two: the systems of record, the ledger the brain reads but never rewrites

The systems of record are the factory's ledger. MES, ERP, LIMS, QMS, and data historians such as OSIsoft PI (now AVEVA PI System) and AVEVA Historian. This is where cell genealogy, bill of materials, work in progress, and quality state live. The brain reads all of it. It does not write back.

The ledger has to stay auditable, single-source, and vendor-neutral, which is precisely why the brain reads but never edits it. The customer's source of truth is not the agent's to rewrite. The vendor landscape here is mature, with Siemens Opcenter, SAP, Critical Manufacturing, and Tulip among the systems battery plants actually run. The limit today is heterogeneity. Genealogy schemas differ from plant to plant, latency between systems is uneven, and the data sits in silos that make a clean unified read harder than vendors admit.

Pillar three: the data foundation, the substrate the brain reads and writes

Here the access mode changes. The data foundation is the substrate that connects materials, process, metrology, electrochemical, and field data in one place. A data lake or lakehouse, a time-series database, the pipelines that feed them, a feature store, and the ontology or knowledge graph that gives the data shared meaning. The brain reads everything here, and it writes: derived signals, annotations, materials-to-process correlations, and the knowledge graph it builds while reasoning over the plant's history.

Read and write is right because the brain is the layer that benefits from owning this end to end. A correlation it finds between a slurry viscosity drift and a downstream coating defect is worth nothing if it cannot be stored, versioned, and reused. Building the substrate is the heaviest infrastructure lift in this list. The limit today is mess. Schema drift, integration debt, and inconsistent units across decades-old equipment make the substrate hard to keep clean, and a dirty substrate poisons every model above it.
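
A minimal sketch of what "stored, versioned, and reused" can mean in practice, together with the unit normalization that keeps the substrate clean. `FindingStore`, `Finding`, the key name, and the correlation values are hypothetical and chosen only for illustration; the conversion table covers viscosity units alone.

```python
from dataclasses import dataclass
from typing import Dict, List

# Conversion factors to one canonical viscosity unit, the pascal-second.
TO_PA_S = {"Pa.s": 1.0, "mPa.s": 1e-3, "cP": 1e-3}


def to_pa_s(value: float, unit: str) -> float:
    """Normalize a viscosity reading so old and new equipment agree."""
    return value * TO_PA_S[unit]


@dataclass(frozen=True)
class Finding:
    key: str
    version: int
    payload: dict


class FindingStore:
    """Append-only: a write adds a new version and never overwrites an old one."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Finding]] = {}

    def write(self, key: str, payload: dict) -> Finding:
        history = self._versions.setdefault(key, [])
        finding = Finding(key, len(history) + 1, dict(payload))
        history.append(finding)
        return finding

    def latest(self, key: str) -> Finding:
        return self._versions[key][-1]

    def history(self, key: str) -> List[Finding]:
        return list(self._versions[key])


store = FindingStore()
key = "slurry_viscosity_vs_coating_defect"
first = store.write(key, {"correlation": 0.62, "viscosity_pa_s": to_pa_s(4500, "cP")})
second = store.write(key, {"correlation": 0.71, "viscosity_pa_s": to_pa_s(4.4, "Pa.s")})
```

Keeping every version lets a later investigation see not only the current correlation but how it changed as more batches came in.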

Pillar four: the predictive and simulation layer, the models the brain calls, trains, and stores

This layer is machine learning plus electrochemistry, now treated as tools the brain can call. State of health and remaining-useful-life prediction, defect and yield models, physics-based electrochemical models, physics-informed neural networks, and digital twins. The brain calls these models, triggers retraining when drift appears, runs simulations, and writes new models and results back into the substrate.
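
One simple way to decide when drift has appeared is to compare a model's recent prediction errors with the errors it made when it was validated. The sketch below assumes that approach; `needs_retraining` and the 1.5 ratio threshold are hypothetical choices, and real deployments often use richer statistical tests.

```python
import numpy as np


def needs_retraining(baseline_errors, recent_errors, ratio_threshold=1.5):
    """Flag drift when recent mean absolute error exceeds the baseline by a set ratio."""
    baseline_mae = np.mean(np.abs(baseline_errors))
    recent_mae = np.mean(np.abs(recent_errors))
    return bool(recent_mae > ratio_threshold * baseline_mae)


# Illustrative residuals (predicted minus measured), invented for this example.
validation_residuals = [0.8, -1.1, 0.9, -1.0, 1.2, -0.9]
last_shift_stable = [1.0, -0.9, 1.1, -1.2, 0.8, -1.0]
last_shift_drifted = [2.4, 2.9, 1.8, 3.1, 2.2, 2.7]

stable_flag = needs_retraining(validation_residuals, last_shift_stable)
drift_flag = needs_retraining(validation_residuals, last_shift_drifted)
```

A check like this is cheap enough to run after every shift, so the decision to retrain becomes a routine call rather than something noticed weeks later.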

Separate the models that run inside the product from the ones used in the plant. State of charge and state of health estimation run on the battery management system after the cell ships. The manufacturing-side models map process to quality and predict outcomes early, before a cell finishes formation. The two traditions have started to merge. Hofmann and colleagues fused a P2D model with lab and field data to train a physics-informed neural network that estimates state of health with a root mean squared error below 2 percent on synthetic data and near 3 percent on lab data^[4], and Wang and colleagues reported a physics-informed network that holds up across chemistries and operating conditions where purely data-driven models tend to fail^[5]. The limit today is generalization. A model tuned on one chemistry and one line rarely transfers cleanly to the next.

Pillar five: the reasoning and agentic layer, the brain the engineer actually talks to

Beside these four layers sits the brain. Large language models and agents that connect materials to process to properties to outcomes, design DOEs, run root cause analysis, troubleshoot deviations, and carry a process from an R&D line to a gigafactory. The point is that the engineer talks to one thing. The brain translates intent into calls across the other four layers, reading the historian, querying the substrate, invoking a cycle-life model, and writing the result back, without the engineer stitching five tools together by hand. Niobia is built to be this layer, grounded in real electrode manufacturing rather than generic analytics, so a question like why defect rate climbed on line two last shift becomes a structured investigation instead of a dashboard hunt.

Why the access boundary is the whole design

Read-only on the ledger and the floor. Read and write on the substrate and the models. This is not a limitation to apologize for. It is what makes the architecture defensible, for three reasons.

Safety. Functional safety frameworks define Safety Integrity Levels and bounded-time response for the systems that protect equipment and people, in the foundational standard IEC 61508^[6] and its process-sector form IEC 61511^[7]. A language-model-driven agent cannot carry a Safety Integrity Level, because its latency and outputs are probabilistic by construction, so it does not actuate and the PLC stays sovereign over the floor.

Auditability. A serious manufacturer needs the ledger to be auditable, vendor-neutral, and single-source, and cell genealogy is the one record a battery maker cannot afford to corrupt. The same logic governs the IT and OT boundary that IEC 62443 exists to protect^[8].

Trust. Tell a plant manager honestly that the agent never actuates equipment and never rewrites the system of record, and the largest fear about agentic systems disappears in one sentence.

When the brain starts building the models on demand

Here is where the architecture starts to move. Today the predictive layer is a fixed library of models the brain calls. As inference gets cheaper and agents get more capable, that relationship inverts. The brain stops only calling pre-built models and starts building them on demand: writing the feature engineering, fitting a predictor for the question in front of it, validating it, and discarding it when the question is answered. The predictive layer becomes something the brain generates, not just a shelf it reaches into.
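
The fit, validate, discard cycle can be sketched in a few lines. This is an assumption-laden illustration: `build_and_validate` is a hypothetical helper, ordinary least squares stands in for whatever predictor an agent would actually choose, and the data is synthetic.

```python
import numpy as np


def build_and_validate(X, y, holdout_fraction=0.25, max_rmse=None, seed=0):
    """Fit an ordinary least-squares predictor and score it on held-out rows."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.random.default_rng(seed).permutation(len(y))
    n_holdout = max(1, int(len(y) * holdout_fraction))
    test_idx, train_idx = order[:n_holdout], order[n_holdout:]

    design = np.column_stack([np.ones(len(y)), X])     # intercept plus features
    coef, *_ = np.linalg.lstsq(design[train_idx], y[train_idx], rcond=None)
    residual = design[test_idx] @ coef - y[test_idx]
    rmse = float(np.sqrt(np.mean(residual ** 2)))
    accepted = max_rmse is None or rmse <= max_rmse
    return coef, rmse, accepted


# Synthetic question: does an outcome depend on two process features?
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 2))                            # invented process features
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(0.0, 0.05, 80)

coef, rmse, accepted = build_and_validate(X, y, max_rmse=0.2)
# Nothing is persisted here: unless the caller stores `coef`, the model is gone
# once the question is answered.
```

The validation gate matters more than the fit. A model built on demand has no track record, so it has to earn acceptance on held-out data every time.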

The substrate moves the same way. The data foundation and the models fold into the brain over time, because it is the only consumer that benefits from owning them end to end. The pattern repeats: SQL was the top layer until BI tools sat on it, BI tools were the top until the agent sat on them, and the substrate and the models are next.

The substrate and the models will fold into the agent. The PLC and the MES will not, by design. Some separations are permanent, and the permanence is the point.

The brain at the edge

Push the idea one more step. The reasoning layer does not have to live in a cloud application. It can run on the testers, the cyclers, the formation equipment, and the metrology tools themselves, reading telemetry the moment it comes off the tool, flagging an anomaly while the cell is still on the line, and proposing the next DOE point before the engineer has finished their coffee.

The read-only boundary travels with it. An agent at the edge still observes and recommends, and actuation still stays with the deterministic controller on the instrument. This is reachable now, not in a decade. Small models have become capable enough for the structured work an edge agent does. MobileLLM, a sub-billion-parameter family from Liu and colleagues, comes close to a 7-billion-parameter LLaMA-2 model on API-calling tasks, the exact call-a-tool, parse-a-result pattern an instrument agent runs^[9]. And agent frameworks already direct real science: Coscientist, a GPT-4-driven system from Boiko and colleagues, autonomously designed, planned, and executed chemistry experiments, including optimizing palladium-catalyzed cross-coupling reactions, by orchestrating search, code, and lab automation as tools^[10]. Doing the same at a cycler is a couple of years out, not ten.

What a battery manufacturer should do this year

The single highest-impact move in the next twelve months is not buying another point-solution tool. It is building the workforce's fluency in the reasoning and agentic layer, and doing it now, because the returns compound and the talent that can build and supervise agents is already scarce.

The economics are unforgiving. Reducing scrap by 10 percent can save roughly 300 million dollars a year for a 30 gigawatt-hour factory^[11], and with each point of scrap costing on the order of 10 million euros a year^[1], the gap between a fluent organization and a slow one is measured in nine figures. The practical version is concrete. Engineers learn to prompt, verify, and orchestrate rather than take an agent's first answer on faith. Process engineers build small internal agents on their own data and own the result. Treat the agent as a junior colleague who is fast, tireless, and occasionally wrong.
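
The per-point figure makes the arithmetic easy to check. The sketch below assumes the cost scales linearly with scrap rate, which the source figure does not guarantee, and the function names are hypothetical.

```python
# Order-of-magnitude figure quoted above: euros per scrap point per year.
COST_PER_SCRAP_POINT_EUR = 10_000_000


def annual_scrap_cost_eur(scrap_rate_percent):
    """Yearly scrap cost, assuming cost is linear in the scrap rate."""
    return scrap_rate_percent * COST_PER_SCRAP_POINT_EUR


def annual_saving_eur(from_percent, to_percent):
    """Yearly saving from moving one scrap rate to a lower one."""
    return annual_scrap_cost_eur(from_percent) - annual_scrap_cost_eur(to_percent)


early_ramp_low = annual_scrap_cost_eur(15)     # 150 million euros a year
early_ramp_high = annual_scrap_cost_eur(30)    # 300 million euros a year
saving = annual_saving_eur(15, 10)             # 50 million euros a year
```

Under that linear assumption, the 15 to 30 percent scrap rates of early production correspond to 150 to 300 million euros a year, and every five points recovered is worth about 50 million.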

Niobia: the appendages that make the brain reliable

Start with what Niobia is not. It is not a state of health predictor, not the next MES, not a digital twin company. It does not actuate the line, and it does not rewrite the ledger. Niobia is the reasoning and agentic layer for battery manufacturing, plus the appendages that make a language model trustworthy on a production floor.

That last part is the whole game, and there is now hard evidence for why. Anthropic's data team reported that pointing a model at a warehouse and letting it run creates a false sense of precision, and that accuracy in analytics is a context and verification problem rather than a code-generation one. Without a domain skills layer, their agent answered analytics questions correctly less than 21 percent of the time. With it, accuracy ran consistently above 95 percent, and around 99 percent in some domains^[12]. The bottleneck was never writing the query. It was mapping a question to the right entity and verifying the answer.

That finding is the case for Niobia. Its appendages are the battery-manufacturing equivalent of that skills layer: domain grounding from the production floor rather than a generic model on a chat box, native integration with the predictive and simulation tools it calls and retrains, first-class read connectors into MES, ERP, historian, and OT data it never writes back through, and the procedures for DOE, RCA, 8D, and scale-up encoded as skills. Those are what reduce hallucination and raise accuracy. In practice, Niobia cuts the time from defect detection to confirmed root cause by a factor of about 50, turning a manual RCA of 3 to 5 days into a structured report in minutes, and when a defect matches no known category, it catalogues a new entry in the facility's defect library with its signature, process context, and cell-level outcome. We are building the data-analyst brain for the gigafactory, not another dashboard.
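
The "match a known category or catalogue a new one" behavior can be illustrated generically. The sketch below is not Niobia's implementation: `DefectLibrary`, the cosine-similarity matching, the 0.9 threshold, and the three-number signatures are all assumptions made for the example.

```python
import numpy as np


class DefectLibrary:
    """Match a defect signature to a known entry, or record it as a new one."""

    def __init__(self, match_threshold=0.9):
        self.match_threshold = match_threshold
        self.entries = []          # dicts with name, signature, context

    @staticmethod
    def _cosine(a, b):
        a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
        norm = np.linalg.norm(a) * np.linalg.norm(b)
        return float(a @ b / norm) if norm > 0 else 0.0

    def classify_or_catalogue(self, signature, context):
        best_name, best_score = None, -1.0
        for entry in self.entries:
            score = self._cosine(signature, entry["signature"])
            if score > best_score:
                best_name, best_score = entry["name"], score
        if best_score >= self.match_threshold:
            return best_name, False                    # known category
        name = f"unclassified-{len(self.entries) + 1}"
        self.entries.append(
            {"name": name, "signature": list(signature), "context": dict(context)}
        )
        return name, True                              # new library entry


library = DefectLibrary()
library.entries.append({"name": "pinhole", "signature": [1.0, 0.0, 0.0], "context": {}})

known = library.classify_or_catalogue([0.98, 0.05, 0.0], {"line": 2})
novel = library.classify_or_catalogue([0.0, 1.0, 0.0], {"line": 2, "step": "coating"})
```

Storing the process context with each new entry is what makes the library useful later: the next occurrence arrives with a history attached instead of starting from zero.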

AI in battery manufacturing was never a switch. Most of it is mature machine learning that has run for over a decade, and what is new is the reasoning and agentic brain that reads the floor and the ledger, reads and writes the data and the models, and never bypasses safety-critical control or the source of truth. The substrate and the models will fold into that brain; the PLC and the MES will not, by design. With scrap costing roughly 10 million euros per point per year and a 50-times faster path to root cause on the table, the layer that matters is the brain, and Niobia AI is building it.

About the author

Dr. Gaurav Jha is the Founder of Niobia AI, which builds AI-powered defect detection and process intelligence platforms for battery gigafactories. His PhD focused on fast-charging niobium pentoxide (Nb₂O₅) based nanostructured anodes, with broader research spanning gas sensors, ion sensors, and energy storage materials. At Intel, he worked on wet etch defect reduction in 5nm and 7nm chip fabrication, developing a hands-on instinct for process root cause analysis at scale that translates directly to electrode manufacturing.

He returned to batteries to develop one of the first large-scale lithium-sulfur cathode coatings at Lyten, then moved to Sila Nanotechnologies where he worked on silicon anode particles for high energy density and fast-charging applications across consumer electronics and automotive programs. Across these roles, Dr. Jha led manufacturing scale-up from lab to high-volume production, conducted industrial root cause investigations, commercialized key materials products, and developed new electrode chemistries from first principles. He founded Niobia AI to bring that depth of manufacturing and materials science experience into an AI platform built specifically for the production floor.

References

[1] Bockey, G., & Heimes, H. (2024). Mastering Ramp-up of Battery Production [White paper]. Fraunhofer Research Institution for Battery Cell Production FFB and Chair of Production Engineering of E-Mobility Components (PEM), RWTH Aachen University.
[2] Doyle, M., Fuller, T. F., & Newman, J. (1993). Modeling of galvanostatic charge and discharge of the lithium/polymer/insertion cell. Journal of The Electrochemical Society, 140(6), 1526-1533. https://doi.org/10.1149/1.2221597
[3] Severson, K. A., Attia, P. M., Jin, N., Perkins, N., Jiang, B., Yang, Z., Chen, M. H., Aykol, M., Herring, P. K., Fraggedakis, D., Bazant, M. Z., Harris, S. J., Chueh, W. C., & Braatz, R. D. (2019). Data-driven prediction of battery cycle life before capacity degradation. Nature Energy, 4(5), 383-391. https://doi.org/10.1038/s41560-019-0356-8
[4] Hofmann, T., Hamar, J. C., Rogge, M., Zoerr, C., Erhard, S., & Schmidt, J. P. (2023). Physics-informed neural networks for state of health estimation in lithium-ion batteries. Journal of The Electrochemical Society, 170(9), 090524. https://doi.org/10.1149/1945-7111/acf0ef
[5] Wang, F., Zhai, Z., Zhao, Z., Di, Y., & Chen, X. (2024). Physics-informed neural network for lithium-ion battery degradation stable modeling and prognosis. Nature Communications, 15. https://doi.org/10.1038/s41467-024-48779-z
[6] International Electrotechnical Commission. (2010). IEC 61508: Functional safety of electrical/electronic/programmable electronic safety-related systems. Geneva: IEC.
[7] International Electrotechnical Commission. (2016). IEC 61511: Functional safety - Safety instrumented systems for the process industry sector (2nd ed.). Geneva: IEC.
[8] International Electrotechnical Commission. (2018). IEC 62443: Security for industrial automation and control systems [standard series]. Geneva: IEC.
[9] Liu, Z., Zhao, C., Iandola, F., Lai, C., Tian, Y., Fedorov, I., Xiong, Y., Chang, E., Shi, Y., Krishnamoorthi, R., Lai, L., & Chandra, V. (2024). MobileLLM: Optimizing sub-billion parameter language models for on-device use cases. Proceedings of the 41st International Conference on Machine Learning (ICML), PMLR 235, 32431-32454. https://arxiv.org/abs/2402.14905
[10] Boiko, D. A., MacKnight, R., Kline, B., & Gomes, G. (2023). Autonomous chemical research with large language models. Nature, 624(7992), 570-578. https://doi.org/10.1038/s41586-023-06792-0
[11] Capgemini. (2024). How to accelerate EV battery manufacturing in gigafactories.
[12] Chang, C., Peng, C., Leder, J., Jiao, J., & Cherry, J. (2026, June 3). How Anthropic enables self-service data analytics with Claude [Blog post]. Anthropic.