CreditGraph: When SQL Isn't Enough

loan_id	client	balance	status	score
PRE-00313	CLI-00010	$87,200	ACTIVO	725.8
PRE-00127	CLI-00011	$132,511	ACTIVO	798.2
PRE-00254	CLI-00012	$132,511	VENCIDO	649.6
PRE-00087	CLI-00020	$50,922	ACTIVO	710.1
PRE-00308	CLI-00021	$9,200	ACTIVO	695.3
... 453 more rows

01

The safe client who isn't safe

AIG parallel: guarantee chains on CDO tranches

Homero Aguayo has a bureau score of 725. Well above the threshold. His own loan is current, no past-due days. Every standard metric says he's fine.

The graph says: Homero guarantees Manuel's loan. Manuel guarantees Cecilia's loan. Cecilia's loan is 45 days past due.

If Cecilia can't pay, Manuel is liable. If Manuel can't cover it, Homero is liable. His credit score was answering a different question: one about Homero in isolation, not Homero in context.

In 2008, AIG's guarantee chains on CDO tranches worked exactly this way. The depth of exposure was unknowable until the underlying asset failed.

SQL: requires depth known at write time

hop count hardcoded when query is written
chain 3 hops long? this query doesn't exist
SELECT c.id_cliente, c.score_buro
FROM clientes c
JOIN garantias g1 ON c.id_cliente = g1.id_garante
JOIN prestamos p1 ON g1.id_prestamo = p1.id_prestamo
JOIN clientes c2 ON p1.id_titular = c2.id_cliente
JOIN garantias g2 ON c2.id_cliente = g2.id_garante
JOIN prestamos p2 ON g2.id_prestamo = p2.id_prestamo
WHERE p2.estatus = 'VENCIDO'

Cypher: depth is runtime

MATCH (a:ClienteIndividual {
  id_cliente: 'CLI-00010'
})-[:GARANTIZA]->(p1)
  <-[:TIENE_PRESTAMO]-(b)
  -[:GARANTIZA]->(p2)
  <-[:TIENE_PRESTAMO]-(c)
WHERE p2.estatus = 'VENCIDO'
RETURN
  a.score_buro AS guarantor_score,
  p2.saldo_vigente_mxn AS exposure,
  p2.dias_mora

Variable-depth traversal (*1..N) is native in Cypher. SQL requires one JOIN per hop, and the hop count must be decided when writing the query, before you know how deep the exposure actually goes.

$132,511 MXN hidden exposure two hops from a client scored 725. invisible in any loan table

02

Coverage that doesn't exist

AIG parallel: insured tranches AIG had also insured

Three borrowers guarantee each other's loans in a closed loop: A guarantees B's loan, B guarantees C's loan, C guarantees A's loan.

A risk system checking "does this loan have a guarantor?" returns YES for all three. They look fully covered.

But the coverage is fiction. If any one of them can't pay, the guarantee activates for someone already exposed to the same failure. Nobody outside the loop is backing anything.

This is a regulatory violation. CNBV Circular 3/2012 prohibits guarantee structures where coverage is fictitious due to circular dependency.

SQL: recursive CTE (breaks in practice)

postgres only. MySQL breaks here.
no cycle guard: stack overflow on real data
you still don't know what the exposure is
WITH RECURSIVE chain AS (
  SELECT g.id_garante,
         ARRAY[g.id_garante] AS visited
  FROM garantias g
  UNION ALL
  SELECT g2.id_garante,
         visited || g2.id_garante
  FROM garantias g2, chain
  WHERE NOT g2.id_garante = ANY(visited)
)
found the cycle. now what?
loan balances are in another table. start over.

Cypher: one pattern

MATCH (c1)-[:GARANTIZA]->(p1)
  <-[:TIENE_PRESTAMO]-(c2)
  -[:GARANTIZA]->(p2)
  <-[:TIENE_PRESTAMO]-(c3)
  -[:GARANTIZA]->(p3)
  <-[:TIENE_PRESTAMO]-(c1)
WHERE c1 <> c2
  AND c2 <> c3
  AND c1 <> c3
RETURN c1, c2, c3,
  p1.saldo_vigente_mxn,
  p2.saldo_vigente_mxn,
  p3.saldo_vigente_mxn

A cycle is a structural pattern in Cypher: the path ends at the same node it started. No visited-array tracking needed. SQL recursive CTEs approach this problem from the opposite direction and require explicit cycle-checking at each step.

CNBV Circular 3/2012 violation

$110,975 MXN reported as 100% guaranteed. actual external coverage: 0%

03

One company falls. Five people follow.

Bear Stearns parallel: most interconnected, not largest

HARI MASA SA DE CV is under restructuring. In the loan table, that's one row. In the graph, it's a hub.

Five shareholders have personal loans with the same bank. Two cross-guarantee each other's personal loans. If HARI MASA collapses, all five lose their equity stake simultaneously. The two cross-guarantors are doubly exposed.

SQL sees five independent borrowers. The graph sees one interconnected risk cluster where the company's failure is a single correlated event affecting all five at once.

Bear Stearns in 2008 wasn't the largest bank. It was the most interconnected. Size doesn't predict damage. Topology does.

SQL: 3 JOINs, static fan-out

table names baked in, not discoverable
"all connected entities" has no SQL equivalent
SELECT c.nombre, p.saldo_vigente_mxn
FROM empresas e
JOIN accionistas a
  ON e.id_empresa = a.id_empresa
JOIN clientes c
  ON a.id_cliente = c.id_cliente
LEFT JOIN prestamos p
  ON c.id_cliente = p.id_titular
LEFT JOIN garantias g
  ON c.id_cliente = g.id_garante
WHERE e.id_empresa = 'EMP-00031'

Cypher

MATCH (e:Empresa {
  id_empresa: 'EMP-00031'
})<-[:ES_ACCIONISTA_DE|ES_DIRECTOR_DE]-(c)
OPTIONAL MATCH (c)-[:TIENE_PRESTAMO]->(p)
OPTIONAL MATCH (c)-[:GARANTIZA]->(g)
RETURN
  c.nombre_completo,
  sum(p.saldo_vigente_mxn) AS personal,
  sum(g.saldo_vigente_mxn) AS guaranteed

OPTIONAL MATCH fan-out in Cypher naturally discovers all connected entities regardless of how many there are. SQL must specify relationship types and table names before execution. The traversal structure is static, not dynamic.

$4,954,280 MXN guarantee exposure correlated through one restructured company's 5 shareholders

04

Four companies that are really one risk

Lehman parallel: 7,000 entities, one control group

Emilio Maldonado appears in four separate loan files as a controlling shareholder. The portfolio report shows exposure spread across four companies in three sectors. It looks diversified.

The graph shows: it's one person. Score 583. Personal income $18,784 MXN/month. If he has a legal problem, a health crisis, or makes one bad strategic decision, all four companies lose their key decision-maker at the same moment.

Entity-level diversification is an illusion when control is concentrated. SQL sees four borrowers. The graph sees one person.

Lehman Brothers had 7,000 legal entities across 40 countries. The portfolio looked diversified. When the parent failed, all 7,000 failed at once. This is why the G20 mandated the creation of GLEIF in 2011.

SQL: self-join to find shared director

finds the count. not the exposure.
two more JOINs for that. which you haven't written.
SELECT a1.id_cliente,
  COUNT(DISTINCT a1.id_empresa) AS n
FROM accionistas a1
JOIN accionistas a2
  ON a1.id_cliente = a2.id_cliente
  AND a1.id_empresa <> a2.id_empresa
GROUP BY a1.id_cliente
HAVING COUNT(DISTINCT a1.id_empresa) >= 2
still 2 JOINs short of knowing the actual risk

Cypher: pattern is the query

MATCH (c:ClienteIndividual)
  -[:ES_ACCIONISTA_DE|ES_DIRECTOR_DE]->(e)
WITH c, count(e) AS n
WHERE n >= 2
MATCH (c)-[:ES_ACCIONISTA_DE]->(e2)
  -[:TIENE_PRESTAMO]->(p)
RETURN
  c.nombre_completo, n,
  sum(p.saldo_vigente_mxn)
    AS true_concentration
ORDER BY true_concentration DESC

Shared-controller detection in SQL requires a self-join on the relationship table, then additional joins to aggregate exposure. In Cypher, "find persons connected to multiple companies" is a structural pattern: the relationship is first-class, not a derived fact from table joins. Mexican regulation requires this under CNBV Circular Única de Bancos, Article 73 (personas relacionadas).

$7,163,560 MXN true single-person concentration. SQL reports 4 separate borrowers across 3 sectors

SQL sees rows.
Graphs see risk.

Step 1: What the flat table reports

The 4 findings SQL can't see

The safe client who isn't safe

Coverage that doesn't exist

One company falls. Five people follow.

Four companies that are really one risk

The pipeline: 8 scripts, sequential

What this project is, and what it isn't

What it demonstrates

What it doesn't claim

Explore the portfolio graph

SQL sees rows.Graphs see risk.

Step 1: What the flat table reports

The 4 findings SQL can't see

The safe client who isn't safe

Coverage that doesn't exist

One company falls. Five people follow.

Four companies that are really one risk

The pipeline: 8 scripts, sequential

What this project is, and what it isn't

What it demonstrates

What it doesn't claim

Explore the portfolio graph

SQL sees rows.
Graphs see risk.