
ICICI Bank Statement Parser: Extract Structured Data with AI

Parse any ICICI Bank statement — savings, current, or credit card — into structured JSON with AI. Handles all formats, net banking PDFs, and multi-page exports.

Tags: icici bank statement, bank statement parser, indian bank api, financial document extraction, ai agent, fintech india, typescript, pdf parsing

ICICI Bank is India's second-largest private bank, with tens of millions of account holders. If you are building a fintech app (loan underwriting, personal finance, expense analytics, or KYC), you will almost certainly encounter ICICI Bank statements from your users.

The problem: ICICI Bank generates different PDF layouts depending on whether the account is savings, current, salary, or credit card, and whether it was exported from net banking, the iMobile app, or sent by relationship managers. A naive OCR or regex parser breaks within weeks as ICICI refreshes its PDF templates.

This guide shows you how to extract structured JSON from any ICICI Bank statement using Lekha — a financial document intelligence API built specifically for Indian formats.

What Makes ICICI Bank Statements Hard to Parse

Before diving into the code, it helps to understand why ICICI statements trip up generic parsers:

  • Multiple layout variants. Savings account statements from net banking use a two-column layout with a running balance column. Current account statements (especially for businesses) add a cheque number column and MICR code. Credit card statements have an entirely different structure: billing cycle, minimum due, reward points.
  • Password protection. ICICI net banking PDFs are often password-protected using the account holder's date of birth (DDMMYYYY) or a custom password. A parser that can't unlock the file returns nothing useful.
  • Mixed transaction types. A single statement may contain NEFT, RTGS, UPI, ECS, IMPS, ATM, and standing instruction entries, each with slightly different narration formats. Extracting payee names from narrations like UPI/123456789/FOOD/ZOMATO@OKICICI requires semantic understanding, not just regex.
  • Multi-page exports. Active accounts can generate 50+ page PDFs covering a full financial year. Page headers repeat, table rows split across pages, and running totals appear only at the end.
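To make the narration problem concrete, here is a minimal TypeScript sketch of the naive, rule-based approach. The names `parseNarration`, `ParsedNarration`, and `KNOWN_MODES` are hypothetical and belong to no SDK, and the second sample narration is made up for illustration. The sketch copes with the slash-delimited UPI format but gives up on free-text narrations, which is where rule-based parsing tends to run out.

```typescript
// Hypothetical illustration: a rule-based narration splitter.
interface ParsedNarration {
  mode: string; // e.g. "UPI", "NEFT"
  parts: string[]; // remaining slash-separated segments
  payeeHandle: string | null; // first segment containing "@", if any
}

const KNOWN_MODES = ["UPI", "NEFT", "RTGS", "IMPS", "ECS", "ATM"];

function parseNarration(narration: string): ParsedNarration | null {
  const segments = narration
    .split("/")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  const [first, ...parts] = segments;
  if (first === undefined) return null;

  const mode = first.toUpperCase();
  if (!KNOWN_MODES.includes(mode)) return null; // free text: no rule applies

  const payeeHandle = parts.find((p) => p.includes("@")) ?? null;
  return { mode, parts, payeeHandle };
}

// Slash-delimited UPI narration: mode "UPI", payeeHandle "ZOMATO@OKICICI"
console.log(parseNarration("UPI/123456789/FOOD/ZOMATO@OKICICI"));

// Made-up free-text narration: returns null
console.log(parseNarration("BY CASH DEPOSIT KORAMANGALA"));
```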

Vision AI handles all of these gracefully. Here's how to use it.

Quickstart: Parse an ICICI Statement in 30 Seconds

Install the Lekha SDK:

bun add lekha-sdk

or: npm install lekha-sdk

Then parse a statement:

import Lekha from "lekha-sdk";
import { readFileSync } from "fs";

const lekha = new Lekha({ apiKey: process.env.LEKHA_API_KEY });

const pdfBuffer = readFileSync("icici-statement.pdf");

const result = await lekha.extract({
  document: pdfBuffer,
  mimeType: "application/pdf",
  documentType: "bank_statement",
});

console.log(result.data);

That's it. Lekha auto-detects the bank, handles password unlocking (if you pass the password), and returns structured JSON. No template configuration, no regex maintenance.

The Full Response Structure

Here is what the extracted JSON looks like for an ICICI savings account statement:

{
  "bank": "ICICI Bank",
  "accountType": "Savings",
  "accountNumber": "XXXXXXXX4521",
  "accountHolder": "Priya Sharma",
  "ifsc": "ICIC0001234",
  "branch": "Koramangala, Bengaluru",
  "currency": "INR",
  "statementPeriod": {
    "from": "2025-04-01",
    "to": "2026-03-31"
  },
  "openingBalance": 42500.0,
  "closingBalance": 187340.5,
  "totalCredits": 2845000.0,
  "totalDebits": 2700159.5,
  "transactions": [
    {
      "date": "2025-04-03",
      "narration": "UPI/345678901/RENT/LANDLORD@OKICICI",
      "type": "debit",
      "amount": 25000.0,
      "balance": 17500.0,
      "mode": "UPI",
      "referenceNumber": "345678901",
      "category": "rent"
    },
    {
      "date": "2025-04-05",
      "narration": "NEFT/AXISBANK/SALARY APR",
      "type": "credit",
      "amount": 85000.0,
      "balance": 102500.0,
      "mode": "NEFT",
      "referenceNumber": "AXISBANK20250405",
      "category": "salary"
    }
  ]
}

Key things to notice:

  • Dates are always ISO 8601 (YYYY-MM-DD), not DD/MM/YYYY
  • Amounts are always numbers, never strings like "₹1,02,500"
  • The mode field is normalized across NEFT, RTGS, UPI, IMPS, ATM, ECS
  • The category field is inferred from narration semantics — salary, rent, food, utilities, etc.
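Because amounts arrive as plain numbers, a statement can be sanity-checked arithmetically before it feeds downstream logic. The sketch below assumes the field names from the sample response above. The `Statement` and `Transaction` interfaces are inferred from that sample, not taken from the SDK's typings, and `totalsReconcile` and `firstBalanceBreak` are hypothetical helper names.

```typescript
// Types inferred from the sample response; not the SDK's official typings.
interface Transaction {
  date: string;
  narration: string;
  type: "debit" | "credit";
  amount: number;
  balance: number;
}

interface Statement {
  openingBalance: number;
  closingBalance: number;
  totalCredits: number;
  totalDebits: number;
  transactions: Transaction[];
}

// Compare in integer paise to avoid floating-point drift on rupee amounts.
const toPaise = (rupees: number): number => Math.round(rupees * 100);

// opening + credits - debits should equal closing.
function totalsReconcile(s: Statement): boolean {
  const expected =
    toPaise(s.openingBalance) + toPaise(s.totalCredits) - toPaise(s.totalDebits);
  return expected === toPaise(s.closingBalance);
}

// Index of the first transaction whose running balance does not follow
// from the previous one, or -1 if every row is consistent.
function firstBalanceBreak(s: Statement): number {
  let running = toPaise(s.openingBalance);
  return s.transactions.findIndex((tx) => {
    running += tx.type === "credit" ? toPaise(tx.amount) : -toPaise(tx.amount);
    return running !== toPaise(tx.balance);
  });
}

// Values copied from the sample response (only its two listed transactions).
const sampleStatement: Statement = {
  openingBalance: 42500.0,
  closingBalance: 187340.5,
  totalCredits: 2845000.0,
  totalDebits: 2700159.5,
  transactions: [
    { date: "2025-04-03", narration: "UPI/345678901/RENT/LANDLORD@OKICICI", type: "debit", amount: 25000.0, balance: 17500.0 },
    { date: "2025-04-05", narration: "NEFT/AXISBANK/SALARY APR", type: "credit", amount: 85000.0, balance: 102500.0 },
  ],
};

console.log(totalsReconcile(sampleStatement)); // true
console.log(firstBalanceBreak(sampleStatement)); // -1
```

A failed check does not say which side is wrong (the PDF or the extraction), but it is a cheap signal that a statement deserves manual review.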

    Handling Password-Protected PDFs

    Many users download ICICI statements from net banking, which are password-protected. Pass the password in the extraction call:

    const result = await lekha.extract({
      document: pdfBuffer,
      mimeType: "application/pdf",
      documentType: "bank_statement",
      password: "01011990", // DDMMYYYY format is ICICI's default
    });
    

    If you don't know the password (common in B2C apps), you can let the user provide it through your UI, or use Lekha's password hint — ICICI's default pattern is the account holder's date of birth in DDMMYYYY format.
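If your app already holds the user's date of birth, you can derive the DDMMYYYY candidate password from it before falling back to asking the user. This is a small sketch: `dobToDdmmyyyy` is a hypothetical helper, and it assumes the date of birth is stored as an ISO `YYYY-MM-DD` string.

```typescript
// Hypothetical helper: converts an ISO date of birth ("YYYY-MM-DD")
// into the DDMMYYYY string described above.
function dobToDdmmyyyy(isoDob: string): string {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(isoDob);
  if (match === null) {
    throw new Error(`Expected YYYY-MM-DD, got "${isoDob}"`);
  }
  const [, year, month, day] = match;
  return `${day}${month}${year}`;
}

console.log(dobToDdmmyyyy("1990-01-01")); // "01011990"
```

Working from the string, not a `Date` object, avoids time-zone shifts that could move the day by one.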

    Credit Card Statements

    ICICI credit card statements have a different schema. Pass documentType: "credit_card_statement" or let Lekha auto-detect:

    const result = await lekha.extract({
      document: pdfBuffer,
      mimeType: "application/pdf",
      // Lekha auto-detects credit card vs savings vs current
    });
    

    if (result.data.documentType === "credit_card_statement") {
      const { billingCycle, minimumDue, totalDue, transactions } = result.data;
      console.log(`Minimum due: ₹${minimumDue} by ${billingCycle.dueDate}`);
    }

    The credit card response includes:

  • billingCycle.from, billingCycle.to, billingCycle.dueDate
  • totalDue, minimumDue, creditLimit, availableCredit
  • rewardPoints.opening, rewardPoints.earned, rewardPoints.redeemed, rewardPoints.closing
  • Per-transaction merchant, category, emi (if an EMI transaction)
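As one example of what these fields enable, the sketch below computes credit utilization from `creditLimit` and `availableCredit`. The `CreditCardSummary` interface is an assumption that covers only the fields listed above, and `creditUtilization` is a hypothetical helper name.

```typescript
// Assumed shape: only the summary fields named in the list above.
interface CreditCardSummary {
  totalDue: number;
  minimumDue: number;
  creditLimit: number;
  availableCredit: number;
}

// Share of the credit limit currently in use, as a percentage rounded
// to two decimals. Returns null when the limit is missing or zero.
function creditUtilization(summary: CreditCardSummary): number | null {
  if (summary.creditLimit <= 0) return null;
  const used = summary.creditLimit - summary.availableCredit;
  return Math.round((used / summary.creditLimit) * 10000) / 100;
}

// Illustrative numbers, not taken from a real statement.
console.log(
  creditUtilization({
    totalDue: 50000,
    minimumDue: 2500,
    creditLimit: 200000,
    availableCredit: 150000,
  }),
); // 25
```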

    Building a Cash Flow Analyzer

    Here is a real-world use case: a cash flow analysis function that takes an ICICI statement and returns monthly income vs expense summaries — the kind of insight a lending agent or personal finance app needs.

    import Lekha from "lekha-sdk";

    const lekha = new Lekha({ apiKey: process.env.LEKHA_API_KEY });

    interface MonthlySummary {
      month: string;
      income: number;
      expenses: number;
      netCashFlow: number;
      topExpenseCategory: string;
    }

    async function analyzeCashFlow(pdfBuffer: Buffer): Promise<MonthlySummary[]> {
      const result = await lekha.extract({
        document: pdfBuffer,
        mimeType: "application/pdf",
        documentType: "bank_statement",
      });

      const { transactions } = result.data;

      // Group transactions by month
      const byMonth: Record<string, typeof transactions> = {};
      for (const tx of transactions) {
        const month = tx.date.slice(0, 7); // "YYYY-MM"
        byMonth[month] = byMonth[month] ?? [];
        byMonth[month].push(tx);
      }

      return Object.entries(byMonth).map(([month, txs]) => {
        // Income: all credits except refunds
        const income = txs
          .filter((t) => t.type === "credit" && t.category !== "refund")
          .reduce((sum, t) => sum + t.amount, 0);

        // Expenses: all debits
        const expenses = txs
          .filter((t) => t.type === "debit")
          .reduce((sum, t) => sum + t.amount, 0);

        // Find top expense category
        const categoryTotals: Record<string, number> = {};
        for (const tx of txs.filter((t) => t.type === "debit")) {
          categoryTotals[tx.category] = (categoryTotals[tx.category] ?? 0) + tx.amount;
        }
        const topExpenseCategory =
          Object.entries(categoryTotals).sort(([, a], [, b]) => b - a)[0]?.[0] ?? "unknown";

        // Round to two decimals (paise precision)
        return {
          month,
          income: Math.round(income * 100) / 100,
          expenses: Math.round(expenses * 100) / 100,
          netCashFlow: Math.round((income - expenses) * 100) / 100,
          topExpenseCategory,
        };
      });
    }

    This works regardless of whether the statement spans 3 months or 24 months. Because Lekha normalizes all dates to ISO 8601 and amounts to numbers, the downstream logic stays clean.

    Processing Multiple Statements in Parallel

    When a user uploads 12 months of statements (one PDF per month, as ICICI net banking allows), process them in parallel:

    async function extractAll(buffers: Buffer[]) {
      const results = await Promise.all(
        buffers.map((buf) =>
          lekha.extract({
            document: buf,
            mimeType: "application/pdf",
            documentType: "bank_statement",
          }),
        ),
      );

      // Merge transactions across all statements and deduplicate.
      // Caveat: two genuinely distinct transactions with the same date,
      // amount, and narration collapse into one under this key.
      const seen = new Set<string>();
      const allTransactions = results
        .flatMap((r) => r.data.transactions)
        .filter((tx) => {
          const key = `${tx.date}-${tx.amount}-${tx.narration}`;
          if (seen.has(key)) return false;
          seen.add(key);
          return true;
        })
        .sort((a, b) => a.date.localeCompare(b.date));

      return allTransactions;
    }

    Lekha processes each PDF in memory — no files are written to disk — so this is safe to run in a serverless environment like Vercel or AWS Lambda.
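An unbounded `Promise.all` fires every request at once. If you would rather cap the number of in-flight extractions (for example, because of memory limits in a serverless function, or request limits that this article does not specify), a small worker-pool helper is one option. `mapWithConcurrency` is a hypothetical utility, not part of the Lekha SDK.

```typescript
// Hypothetical utility: runs `worker` over `items` with at most `limit`
// calls in flight, and returns results in the original order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (limit < 1) throw new Error("limit must be at least 1");

  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner claims items one at a time until none are left.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

Assuming the `lekha` client from earlier, `extractAll` could call `mapWithConcurrency(buffers, 3, (buf) => lekha.extract({ ... }))` in place of `Promise.all`. As with `Promise.all`, the first rejected extraction rejects the whole call.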

    Accuracy on ICICI-Specific Quirks

    Here is how Lekha handles the edge cases that break other parsers:

    | Quirk                          | Generic OCR             | Lekha                        |
    | ------------------------------ | ----------------------- | ---------------------------- |
    | Net banking PDF layout         | ~80% accuracy           | 99%+                         |
    | iMobile app PDF layout         | Often fails             | Supported                    |
    | Password-protected PDF         | Blocked                 | Pass password param          |
    | Credit card EMI entries        | Misclassified           | Correct with EMI flag        |
    | Reward points narrations       | Treated as transactions | Filtered out                 |
    | Cheque bounce / return entries | Missed                  | Captured with returnReason   |
    | Multi-page (50+ pages)         | Truncated               | Full extraction              |

    You can test any ICICI Bank statement in the Lekha Playground — no code required.

    What to Build Next

    Once you have structured JSON from ICICI statements, common next steps include:

  • Loan underwriting: Check average monthly balance, salary credit regularity, and EMI-to-income ratio
  • Expense categorization: Group UPI transactions by merchant, flag unusual spikes
  • Tax preparation: Filter by financial year, extract TDS deductions from interest credits
  • Account aggregation: Combine ICICI data with other banks via the Account Aggregator network

    All of these are much easier when the raw PDF is already normalized JSON.
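As a sketch of the tax-preparation and underwriting ideas above, the helpers below bucket a date into India's April-to-March financial year and list the months that contain a salary credit. The `Tx` shape is inferred from the sample response, the `"salary"` category value comes from that sample, and `financialYearOf` and `salaryMonths` are hypothetical names.

```typescript
// Shape inferred from the sample response; only the fields used here.
interface Tx {
  date: string; // ISO 8601, "YYYY-MM-DD"
  type: "debit" | "credit";
  amount: number;
  category: string;
}

// India's financial year runs from 1 April to 31 March.
// Returns a label such as "2025-2026".
function financialYearOf(isoDate: string): string {
  const year = Number(isoDate.slice(0, 4));
  const month = Number(isoDate.slice(5, 7));
  const startYear = month >= 4 ? year : year - 1;
  return `${startYear}-${startYear + 1}`;
}

// Sorted list of "YYYY-MM" months with at least one salary credit.
// Gaps in this list are a simple signal of irregular salary income.
function salaryMonths(transactions: Tx[]): string[] {
  const months = new Set<string>();
  for (const tx of transactions) {
    if (tx.type === "credit" && tx.category === "salary") {
      months.add(tx.date.slice(0, 7));
    }
  }
  return Array.from(months).sort();
}

console.log(financialYearOf("2025-04-01")); // "2025-2026"
console.log(financialYearOf("2026-03-31")); // "2025-2026"
```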

    FAQ

    Does Lekha support all ICICI account types?
    Yes. Savings, salary, current, NRE/NRO, and credit card statements are all supported. The response schema adapts to the account type automatically.

    How does Lekha handle statements with Hindi text?
    ICICI Bank statements are primarily in English, but some older formats include Hindi labels. Lekha's vision AI reads both scripts, so mixed-language PDFs work correctly.

    Can I extract statements from the ICICI iMobile app PDF?
    Yes. iMobile exports use a slightly different column order than net banking PDFs, but Lekha handles both. No configuration needed.

    Is processing ICICI Bank statements DPDP compliant?
    Lekha processes documents in memory only; nothing is persisted to disk or stored in a database. See our DPDP compliance guide for the full architecture.

    Ready to start extracting? Get your API key and run your first extraction in under 5 minutes at lekhadev.com. The free tier includes 50 extractions per month — no credit card required.